PREFACE

This report presents a detailed description of the sample design
and estimation procedures employed by the Health Examination Survey
in a nationwide survey of youths 6-11 years of age in the
noninstitutional population of the United States. The objective of the
survey was to collect data which would provide national estimates
and distributions of various health characteristics related to the
growth and development of this target population.

The overall responsibility for the development of the design
and other sampling aspects of the survey was that of Walt R. Simmons,
Assistant Director for Research and Scientific Development, National
Center for Health Statistics (NCHS). Carrie J. Losee, formerly
Assistant Statistical Advisor, NCHS, with assistance from George
A. Schnack, Office of Statistical Methods, NCHS, shared in the
planning of the design and was responsible for the development and
execution of specific sampling procedures. Innovations in the design,
such as the Latin-square modification of the controlled selection
techniques, are the joint contribution of all three above-named
persons. The Statistical Methods Division, Bureau of the Census,
particularly Robert Hanson, devised the techniques for, and performed
the ultimate stage selection of, sample segments from 1960
census listings.

This report was prepared jointly by the three staff members
listed as its authors. Much of the report is based upon internal
unpublished documents written by Messrs. Losee and Simmons.
SAMPLE DESIGN AND ESTIMATION PROCEDURES
FOR A NATIONAL HEALTH EXAMINATION
SURVEY OF CHILDREN

E. Earl Bryant, Office of Statistical Methods, and James T. Baird, Jr., and Henry W. Miller, Division
of Health Examination Statistics



INTRODUCTION 

The Health Examination Survey is one of
the major survey programs employed by the
National Center for Health Statistics to obtain
information about the health status of the U.S.
population. It is a part of the National Health
Survey, authorized in 1956 by the 84th Congress
as a continuing Public Health Service activity.

The National Health Survey employs three
different survey programs to accomplish its
objectives. One of these is the Health Interview
Survey, in which persons are asked to give
information related to their health or to the health
of other household members. The second program,
Health Resources, obtains health data and health
resource and utilization information through
surveys of hospitals, nursing homes, and other
resident institutions and through the entire range of
personnel in the health occupations. The third
major program is the Health Examination
Survey (HES).

The Health Examination Survey collects data
from samples of the civilian, noninstitutional
population of the United States and, by means of
medical and dental examinations and various tests
and measurements, undertakes to characterize
the population under study. This is the most
accurate way to obtain diagnostic data on the
prevalence of certain medically defined illnesses.
It is the only way to obtain information on
unrecognized and undiagnosed conditions and, in some
cases, even nonsymptomatic conditions. It is
also the only way presently available to obtain
distributions of the population by a variety of
physical, physiological, and psychological
measurements. Although the sample is designed
primarily to estimate the prevalence of specified
health and health-related conditions in the
population, the design also makes possible the study
of relationships of the examination findings to one
another and to certain demographic and
socioeconomic factors.

Successive and separate survey programs
are conducted for specific age segments of the
population. These programs, referred to as
"cycles," are concerned with certain specified
health aspects of that subpopulation. Thus, the
first cycle of the Health Examination Survey was
conducted between November 1959 and December
1962 and was directed toward the civilian,
noninstitutional population from ages 18-79 years
inclusive. The examination was focused primarily
on certain chronic diseases, principally
cardiovascular diseases, arthritis and rheumatism,
and diabetes. Also included were a dental
examination, tests for visual and auditory acuity,
an X-ray, electrocardiographic tracings, blood
chemistry tests, and numerous body
measurements. The sample size of this cycle was 7,710
persons, of which 6,672 (86.5 percent) were
examined. Details of the plan of this initial
program and reports of various methodological
studies and of the findings relative to that
cycle are available.

The target population of the second cycle
of the Health Examination Survey consisted of
children ages 6-11 years inclusive. This cycle
became operational in July 1963 and was concluded
in December 1965. The primary focus of the
examination was on various parameters of growth
and development, but it also screened for heart
disease; congenital abnormalities; ear, nose, and
throat diseases; and neuro-musculo-skeletal
abnormalities. The size of the sample of this cycle
was 7,417, of which 7,119 (96.0 percent) were
examined. Several methodological reports, as
well as reports of findings, have been published,
and others are being prepared.

A detailed report of the plan, operation, and
response results of the second cycle has also
been published. While that report does include
a general description of the sample design, it
was necessarily limited by the scope of the report.
It will, therefore, be the object of this report to
describe in detail the various aspects of the
sample design and selection procedures, weighting
techniques used for population estimation, and
procedures employed for variance estimation.

PRELIMINARY CONSIDERATIONS
AND SPECIFICATIONS

The development of a successful sample 
design must take into account all relevant fac- 
tors and circumstances. In view of the primary 
mission of the Health Examination Survey, this 
means that there must be a blend of primary 
survey objectives, budgetary resources, logis- 
tical considerations, time limitations, organized 
speculation concerning population parameters, 
and unit operating costs. These and other re- 
quirements in Cycle I dictated that a highly 
stratified multistage probability type of design 
be used in contrast to some possible alternative 
of a more subjective or volunteer selection of 
examinees. 

The similarity between Cycles I and II,
particularly with respect to their broad mission,
indicated a similar probability type of design
for Cycle II. It should be pointed out, however,
that while of necessity several features were
common to both designs, considerable statistical
exploration was carried out for Cycle II to
determine the optimum design with respect to
sample size, sample allocation, sampling frame,
and operational procedures.

In the early planning stages of Cycle II,
two problems basic to the sample design
received considerable attention. These were the
age segment of the population to be examined
and the sampling frame to be used. The original
concept was that the age group to be studied in
Cycle II would be persons ages 6-17 years
inclusive. As the detailed planning proceeded,
however, it became apparent that the differences
between persons in different age segments of
this population group were so great that
separate programs were required. Therefore, it
was decided to redefine the Cycle II target
population as children from ages 6-11 years,
inclusive, and to follow this program with a third
cycle which would have youths 12-17 years,
inclusive, as its target population.

Since almost all the population in the age
group 6-11 years are in school for a large
part of the time, it was felt that a sample
design which used the school populations as an
element of stratification might have some
operational advantages. For example, if schools
could be grouped by type (public, parochial,
private, etc.), size, socioeconomic characteristics
of the students enrolled, and segregation
factors, a sample of children from one or more
schools in each group might minimize the number
of specific locations from which the sample
children would come. Although some consideration
was given to using the schools in this way
as a sampling frame, the idea was abandoned
because of the unavailability of the necessary
classificatory data concerning the schools,
difficulties anticipated during summer months, and
geographic coverage of nonpublic school children.

Consideration was also given to selecting
an original sample of 15,000 to 25,000 children
and to making some of the simpler elements
of the examination on all. A smaller sample
would be selected from the original group and
would be subjected to the additional examination
and tests requiring more elaborate equipment
or procedures. Important advantages of such a
scheme were that it would permit a two-phase
selection of the smaller sample and would
provide poststratifying information that would reduce
sampling variance. This plan was discarded,
however, because of the operational problems
it seemed to present.



In the final analysis, the sample design of
Cycle II was developed essentially from a set
of specifications which took into consideration
requirements and limitations placed upon it. It
was important that the requirements be consistent
with survey objectives and that the limitations
not be so serious as to materially distort
the objectives. Specifications of primary importance
were as follows:

1. The target population would be the noninstitutional
   population of the United States from 6-11 years of age,
   inclusive, with one exception. Because of operational
   difficulties experienced in Cycle I, all children residing
   upon any of the reservation lands set aside for the use of
   American Indians would be excluded.

2. The data collection mechanism developed and proved during
   Cycle I would be used, with appropriate modifications.
   Examinations would be conducted in mobile examination
   centers, two of which would be in operation simultaneously
   in different parts of the country.

3. The total period of data collection for Cycle II would be
   between 2 and 3 years. Other time limitations were a maximum
   6-day workweek, a 5-week-per-year loss of time due to
   vacations and holidays, and a 7-day loss per move from one
   examining location to another.

4. The length of an individual examination would be between
   2 and 3 hours. Approximately 12 children would be examined
   per day.

5. Experienced and qualified personnel in the field staff for
   Cycle I would be retained to the extent necessary to perform
   the data collection operation in Cycle II.

6. The schedule of examining locations or stands must take into
   account the climate, especially to avoid conducting the
   survey in Northern States during the winter.

7. Certain cost factor limitations such as budget loads
   projected for each of the fiscal years 1962 and 1963 must be
   observed.

8. The examination objectives would be concerned primarily with
   factors of physical and mental growth and development.

9. Ancillary data would be collected through the use of
   questionnaires. These would consist of a household
   questionnaire, a medical history of the child completed by
   the parent, and an interviewer-administered medical history
   questionnaire. Also, a questionnaire would be sent to the
   school at which the sample child is a student.

10. Maximum target tolerances for sampling variability would be
    set for several key statistics, permitting a general
    analysis by broad geographic regions, population size
    groups, and other major subgroups such as age, sex, and
    limited socioeconomic factors.

DETERMINATION OF SAMPLE SIZE 

The size of sample required for a survey
is influenced by a number of factors. These
include the sample design, estimating procedure,
confidence-tolerance specifications, variability
and prevalence of population characteristics to
be measured, available budget and unit costs,
and operational constraints placed on the design.
Once all such factors are determined, and therefore
fixed, the sample size requirement for a
stratified design will vary depending on how the
sample is allocated to strata and how the sample
is clustered within strata. In designing Cycle II,
one such factor examined was how to allocate
the sample in such a way as to produce estimates
with minimum variance for a fixed budget.

One of the design specifications was to permit
analysis by broad geographic regions. Thus,
a first consideration was to divide the population
of the United States into a number of geographic
regions approximately equal in population size.
As explained in greater detail in a later section,
this resulted in four regions with further
stratification occurring within each. The latter
stratification further produced an equal number of
strata within each region which in turn were also
approximately equal in size. Under these conditions,
population variances are often about the
same magnitude in each stratum. Also, the cost
of examining an individual should be somewhat
similar from one examining location to another.

These features of the design indicated that
an equal allocation of the sample to strata would
be approximately optimum. Thus, in determining
the sample size, the main consideration was how
to allocate the sample between the first- and
second-stage units, that is, the number of primary
sampling areas or units (PSU's) and the
number of sample persons per PSU.

To determine an optimum solution to this
problem, a cost relationship,

$$B = C_0 + C_1 m + C_2 m n,$$

was assumed, where $m$ = number of PSU's; $n$ =
number of sample persons per PSU; $B$ = total
budget for the survey; $C_0$ = overhead costs; $C_1$ =
costs associated with a PSU, such as travel between
PSU's; and $C_2$ = costs associated with persons, such
as the cost to examine a person. The optimum values
of $m$ and $n$ for a two-stage cluster sample design
which yield estimates with minimum variance for
a fixed budget are:

$$n_{(\text{optimum})} = \frac{S_w}{S_b}\sqrt{\frac{C_1}{C_2}}$$

$$m_{(\text{optimum})} = \frac{B - C_0}{C_1 + C_2\, n_{(\text{optimum})}}$$

where $S_w$ and $S_b$ are components of the total
population standard deviation due to variation within
PSU's and between PSU's respectively. Estimates
of $S_w$ and $S_b$ were computed from data
collected in a probability sample of 14 PSU's
completed early in Cycle I, using the formulas:

where $n_j$ is the actual number of sample persons
in the PSU.

The proportion of the population with a
specified health characteristic, $p$; the within-PSU and
between-PSU components of variance, $S_w^2$ and $S_b^2$;
the sampling error of the estimated proportion; and
the optimum values of $m$ and $n$ are shown in table A
for a number of health conditions. The information on
which these estimates were based was not ideal for
designing the Cycle II sample, since it related
to adults and not to children; and the health
characteristics were not the same as those to
be considered for Cycle II. Thus the assumption
had to be made that the information about variances
and unit costs for the survey of adults
also held approximately true for growth and
development characteristics of children 6-11
years of age.
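
As a rough illustration of the fixed-budget allocation just described, the following Python sketch applies the standard two-stage result, n = (S_w / S_b) * sqrt(C_1 / C_2) and m = (B - C_0) / (C_1 + C_2 * n), to variance components of the size shown in table A. The report does not state the unit costs. The cost ratio C_1/C_2 = 70 and the budget (B - C_0)/C_2 = 10,000 used here are assumptions, back-calculated so that the results land close to the table A values, and the function name is hypothetical.

```python
import math

def optimum_allocation(within_var, between_var, cost_ratio, budget_units):
    """Return (m, n) minimizing variance for a fixed budget.

    within_var   -- S_w squared, within-PSU variance component
    between_var  -- S_b squared, between-PSU variance component
    cost_ratio   -- C_1 / C_2 (assumed, not given in the report)
    budget_units -- (B - C_0) / C_2, budget in person-cost units (assumed)
    """
    n_opt = math.sqrt(within_var / between_var) * math.sqrt(cost_ratio)
    m_opt = budget_units / (cost_ratio + n_opt)
    return m_opt, n_opt

# Assumed cost parameters (back-calculated, see the note above).
COST_RATIO = 70.0
BUDGET_UNITS = 10_000.0

# Variance components taken from table A: (within, between).
examples = {
    "High blood pressure": (0.135, 0.00468),
    "Diabetes": (0.017, 0.00011),
    "Peripheral vascular disease": (0.089, 0.00514),
}

for name, (sw2, sb2) in examples.items():
    m, n = optimum_allocation(sw2, sb2, COST_RATIO, BUDGET_UNITS)
    print(f"{name}: about {m:.0f} PSU's and {n:.0f} persons per PSU")
```

The sketch shows the general pattern discussed in the text: the larger the between-PSU component is relative to the within-PSU component, the more PSU's and the fewer persons per PSU the optimum calls for.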

As is nearly always true in surveys which
have multiple objectives, the optimum values of
m and n vary for the different variables.
For the statistics proposed for this survey, the
values ranged from 57 PSU's and 105 sample
persons per PSU for estimating diabetes to 95
PSU's and 35 sample persons per PSU for estimating
peripheral vascular disease. The choice
of a best design was not possible because all
variables were of equal importance, and therefore
a compromise had to be made. If precision
and budget were the only factors to influence
the choice of a best design, possibly the
choice would be about 75 PSU's and 64 sample
persons per PSU since the optimum for eight
of the variables requires 75-95 PSU's and a
similar number requires less than 75 PSU's.
Sample designing is not that simple, however.
The best design is also a function of things other
than sampling error, such as availability of
personnel and equipment and procedures which
minimize measurement errors.

For the Health Examination Survey, an item
of considerable importance and concern is
nonresponse. It was learned in Cycle I that a high
cooperation rate can be expected, however, if
one is willing to make several callbacks to find
the family at home and to set a time for the
examination that is convenient for the sample
person. To accomplish this requires that the
examining team remain in the area at least 2 or 3
weeks. Another important factor which influenced
the choice between design alternatives was the
need to minimize the loss of effective time
resulting from moving from one location to
another. Thus there is a limit to the number of
PSU's that can be completed with available
resources and time limitations.

As seen in table A, for a 40 PSU design
it is possible to examine 180 persons per PSU,
or a total of 7,200 persons, for about the same
cost as that for the optimum designs indicated
in the table. Although the sampling errors are
larger for a 40 PSU design than for the corresponding
optimum design, for most practical considerations
in using the results of the survey
the 40 PSU design and the optimum design
can be viewed as having about the same reliability.
Therefore, when all factors were considered,
the 40 PSU design was chosen as best
under prevailing circumstances.

Table A. Comparison of a 40 PSU design with minimum variance, fixed-cost optimum designs for 14 health
statistics collected in Cycle I of the Health Examination Survey

                                Proportion   Within-    Between-   Optimum design               Selected design
Health statistics               with char-   PSU        PSU        PSU's   Persons   Sampling   PSU's   Persons   Sampling
in Cycle I                      acteristic   variance   variance   (m)     per PSU   error      (m)     per PSU   error
                                                                           (n)                          (n)

High blood pressure             .168         .135       .00468     87      45        .0095      40      180       .0115
Organic heart disease           .084         .076       .00147     77      60        .0060      40      180       .0070
Peripheral vascular disease     .105         .089       .00514     95      35        .0090      40      180       .0120
Arthritis                       .215         .162       .00664     90      41        .0110      40      180       .0135
Visual acuity                   .278         .198       .00297     72      69        .0090      40      180       .0100
Edentulous persons              .169         .136       .00439     86      46        .0090      40      180       .0115
Weight greater than average     .605         .235       .00411     75      64        .0100      40      180       .0115
Diabetes                        .017         .017       .00011     57      105       .0015      40      180       .0020
Headaches                       .743         .191       .00180     64      87        .0080      40      180       .0085
Nose bleeds                     .113         .100       .00093     64      87        .0055      40      180       .0060
Tinnitus                        .327         .220       .00248     67      79        .0090      40      180       .0095
Dizziness                       .431         .245       .00658     83      51        .0115      40      180       .0140
Orthopnea                       .076         .067       .00095     71      71        .0050      40      180       .0055
Chest pains                     .310         .214       .00423     77      60        .0100      40      180       .0115
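
The comparison between the optimum and the selected design can be approximated with the usual two-stage expression for the variance of an estimated proportion, Var(p) = S_b^2 / m + S_w^2 / (m n), with finite population corrections ignored. The Python sketch below applies it to the high blood pressure row of table A. It is an approximation under that stated assumption, not necessarily the exact computation used to prepare the table, and the function name is hypothetical.

```python
import math

def sampling_error(m, n, within_var, between_var):
    """Approximate standard error of a proportion for m PSU's and
    n sample persons per PSU (no finite population correction)."""
    return math.sqrt(between_var / m + within_var / (m * n))

# High blood pressure row of table A.
within_var, between_var = 0.135, 0.00468

se_optimum = sampling_error(87, 45, within_var, between_var)
se_selected = sampling_error(40, 180, within_var, between_var)

print(f"optimum design  (87 PSU's x 45 persons):  {se_optimum:.4f}")
print(f"selected design (40 PSU's x 180 persons): {se_selected:.4f}")
print(f"ratio of selected to optimum: {se_selected / se_optimum:.2f}")
```

With only 40 PSU's the between-PSU term dominates the variance, which is why adding persons within each PSU recovers only part of the precision lost by visiting fewer locations.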

FIRST-STAGE DESIGN AND 
SELECTION OF PSU's 

General 

A major and often expensive task in designing
and implementing a national population
survey is to establish and maintain a sampling
frame containing the target population, to order
the population in such a way that facilitates
sample design efficiency, and to select the sample
units. Fortunately, much of this work had
already been done as part of the U.S. Bureau of
the Census Current Population Survey (CPS)
and the Health Interview Survey (HIS). For these
purposes, the 3,103 counties and independent
cities which compose the total land area of the
United States had been combined into 1,891 primary
sampling units and had been further stratified
into 357 homogeneous classes or strata.
The first-stage sample units for both CPS and
HIS (at the time of designing Cycle II) contained
357 PSU's, one from each stratum.

To implement these surveys the Bureau of
the Census maintains a trained field staff of
several hundred people located in 12 regional
offices. The Bureau also maintains a continuing
program for keeping the sampling frame current
through the collection of building permits
issued in the sample PSU's. Thus design efficiency
was significantly enhanced by taking advantage
of the Bureau resources in designing
Cycle II.

The Cycle II sample of PSU's consists of
40 of the 357 HIS PSU's. It is not a subsample
in the usual sense of the word, however. The
characteristics of the 357 HIS sample PSU's
were used, as a matter of convenience, to collapse
the 357 HIS strata into 40 HES superstrata.
Then by use of controlled selection, one
HIS stratum (HIS strata are referred to subsequently
as "first-stage units" or FSU's) was selected from each
superstratum with probability proportional to
the size of the first-stage unit. Finally, the sample
PSU that originally represented the HIS
stratum was chosen for the Cycle II sample.
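
The effect of selecting a first-stage unit with probability proportional to size can be sketched in Python as follows. This is plain probability-proportional-to-size selection of one unit from a single superstratum, not the controlled selection actually used, and the FSU names and population sizes are invented for illustration. The sketch also shows why a fixed sample take per selected unit gives every person in the superstratum the same overall chance of selection.

```python
import random

def pps_select(sizes, rng):
    """Pick one unit name with probability proportional to its size."""
    names = list(sizes)
    weights = [sizes[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

def overall_probability(sizes, name, sample_take):
    """P(unit is selected) times P(person is selected within the unit)."""
    total = sum(sizes.values())
    p_unit = sizes[name] / total
    p_person = sample_take / sizes[name]
    return p_unit * p_person

# Hypothetical superstratum: FSU populations in thousands (invented).
fsu_sizes = {"FSU A": 1200, "FSU B": 1500, "FSU C": 1800}

rng = random.Random(1963)  # fixed seed so the sketch is repeatable
print("selected:", pps_select(fsu_sizes, rng))

# A take of 180 persons (0.18 thousand) from whichever FSU is selected.
for name in fsu_sizes:
    print(name, overall_probability(fsu_sizes, name, sample_take=0.18))
```

Because the unit's size cancels between the two stages, the overall probability depends only on the sample take and the superstratum total, which is the property that makes superstrata of roughly equal size attractive.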

Although detailed descriptions of the HIS and
CPS sample designs have been published, a
brief summary of how the PSU's were formed
and stratified is presented in appendix II to facilitate
understanding of the full design of Cycle II.

In this section the procedures for forming
superstrata and for selecting first-stage units
from superstrata are discussed.

Formation of HES Superstrata 

To understand how superstrata were formed
it is useful to view all of the PSU's in an HIS
stratum as a single unit. In this report, these
units are called first-stage units since the first
stage of sample selection in Cycle II was of FSU's.
The first step in the Cycle II design was to
stratify the 357 FSU's into 40 superstrata on the
basis of the characteristics of the HIS sample
PSU's. This was done in a manner which maximized
the degree of homogeneity within superstrata
with respect to FSU population size, geographic
proximity, degree of industrialization,
and degree of urbanization. Stratification was
carried out within 16 mutually exclusive cells
formed by classifying the FSU's into four population
density classes within each of four geographic
regions of the United States.

Other features of the design which had an
influence on how the superstrata were to be formed
included the need to produce self-weighting estimates,
to produce estimates for each of the
four regions, and to have a sample of approximately
the same size for each PSU. The implications
of these conditions on design efficiency
are that the regions should be about the
same size, each region should contain about the
same number of strata, and each stratum should
contain about the same number of people. This
type of balance was achieved by creating 10
superstrata in each region with the condition
that each of the population density classes
(largest standard metropolitan statistical areas
(SMSA's), other large SMSA's, other SMSA's
and highly urban counties, and rural and other
urban areas) would also contain 10 superstrata.
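
The two-way balance described above can be checked with a few lines of Python. The cell counts are the numbers of superstrata by region and population density class reported in table C of this report. The variable names are illustrative only.

```python
# Number of superstrata in each region, by population density class
# (order: largest SMSA's, other large SMSA's, other SMSA's and highly
# urban counties, rural and other urban areas). Counts from table C.
superstrata = {
    "Northeastern": [3, 2, 3, 2],
    "Midwestern":   [2, 3, 2, 3],
    "Southern":     [2, 3, 3, 2],
    "Western":      [3, 2, 2, 3],
}

# Marginal totals: one per region and one per density class.
region_totals = {region: sum(cells) for region, cells in superstrata.items()}
class_totals = [
    sum(cells[j] for cells in superstrata.values()) for j in range(4)
]

print("superstrata per region:       ", region_totals)
print("superstrata per density class:", class_totals)
print("total superstrata:            ", sum(class_totals))
```

Every cell holds either two or three superstrata, so no region and no density class is over- or underrepresented among the 40 examining locations.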

To create regions containing about the same
number of people, it was necessary to redefine
the commonly used Bureau of the Census regional
boundaries. A comparison of the two definitions
is shown in figure 1.

The four geographic regions are:

Northeastern: identical to the Census-defined
              Northeast Region.

Midwestern:   Census-defined North Central
              Region less Kansas, Nebraska,
              North Dakota, and South Dakota.

Southern:     Census-defined South Region less
              Oklahoma and Texas.

Western:      Census-defined West Region plus
              those parts detached from the
              North Central and South Regions.

Figure 1 is somewhat misleading, however, 
in that the actual content of the Cycle II regions 
does not follow the State lines in all instances. 
This is the result of assigning FSU's to regions 
according to the State within which the sample 
PSU in the HIS design was located. Some strata 
in the Western Region contain PSU's actually 
located in the Midwestern and Southern Regions. 
Similarly, some strata in the Midwestern and 
Southern Regions include PSU's located in the 
Western Region. The problem is not serious,
however, since only a very small proportion of
a region's population is involved in the overlap.

The four population density classes, which
also divide the country into four roughly equal
parts, were defined on a sliding scale. For example,
the Atlanta SMSA in the Southern Region
with a population of about a million people is
equated on the scale to Philadelphia in the Northeastern
Region and to Chicago in the Midwestern
Region. The reasoning is that Atlanta has a position
of economic importance in the Southern
Region similar to that of the other two cities
in their respective regions. The approximate
population ranges for size classes are shown
in table B.

The average size of superstrata and the
distribution of FSU's and superstrata by geographic
region and population density class are
shown in table C.

Note that each density class within a region
was represented by either two or three superstrata
and that the average size of the superstrata
was around 4.5 million people.

Seven of the superstrata were self-representing.
That is, each contained a single FSU.
The New York SMSA was split to form two
superstrata, as was Los Angeles. The others
were the Detroit, Philadelphia, and Chicago SMSA's.

The non-self-representing superstrata were
formed by grouping two or more FSU's. To the
extent possible the FSU's in a superstratum were
similar in size, as well as in other characteristics
mentioned above.

In the highest two population density classes,
the FSU's tended also to be self-representing.
In general these were SMSA's of more than
500,000 people, as indicated in table C. Superstrata
composed of "other SMSA's and highly
urban counties" in the Northeastern and Western
Regions were, for the most part, made up of
self-representing FSU's as shown in table C.
In contrast, all FSU's in this class in the Southern
Region and 85 percent in the Midwestern Region were
non-self-representing.



Table B. Definition of population density classes within geographic regions

Geographic      Population density class                  Definition
region

Northeastern    Largest SMSA's                            SMSA's with more than 3 million people
                Other large SMSA's                        SMSA's with 1-2 million people
                Other SMSA's and highly urban counties    SMSA's with less than 1 million people
                Rural and other urban areas               All rural and other urban areas

Midwestern      Largest SMSA's                            SMSA's with more than 3 million people
                Other large SMSA's                        SMSA's with 500,000-2 million people
                Other SMSA's and highly urban counties    Other SMSA's and highly urban counties with less than 500,000 people
                Rural and other urban areas               All rural and other urban areas

Southern        Largest SMSA's                            SMSA's with more than 700,000 people
                Other large SMSA's                        Other SMSA's
                Other SMSA's and highly urban counties    Non-SMSA, highly urban areas
                Rural and other urban areas               All rural and other urban areas

Western         Largest SMSA's                            SMSA's with more than 1,100,000 people
                Other large SMSA's                        SMSA's with 500,000-1,100,000 people
                Other SMSA's and highly urban counties    Other SMSA's
                Rural and other urban areas               All rural and other urban areas


Table C. Distribution and average size of superstrata and first-stage units by geographic region
and population density class and whether or not self-representing

                                              Number of superstrata               Number of FSU's              Average size
                                                                                                              (in thousands)
Geographic region and                                    Self-  Non-self-                 Self-  Non-self-
population density class                                repre-     repre-                repre-     repre-     Super-
                                              Total    senting    senting      Total    senting    senting     strata      FSU's

United States                                    40          7         33       ¹364        122        242      4,462        492
  Largest SMSA's                                 10          7          3         16         16          0      4,419      2,762
  Other large SMSA's                             10          0         10         64         54         10      4,269        667
  Other SMSA's and highly urban                  10          0         10        132         44         88      4,532        343
  Rural and other urban areas                    10          0         10        152          8        144      4,704        309

Northeastern Region                              10          3          7         64         34         30      4,462        697
  Largest SMSA's                                  3          3          0          3          3          0      5,013      5,013
  Other large SMSA's                              2          0          2          5          5          0      4,589      1,836
  Other SMSA's and highly urban                   3          0          3         27         20          7      3,762        418
  Rural and other urban areas                     2          0          2         29          6         23      4,558        314

Midwestern Region                                10          2          8         88         24         64      4,688        533
  Largest SMSA's                                  2          2          0          2          2          0      5,279      5,279
  Other large SMSA's                              3          0          3         18         18          0      4,604        767
  Other SMSA's and highly urban                   2          0          2         27          4         23      4,733        351
  Rural and other urban areas                     3          0          3         41          0         41      4,349        318

Southern Region                                  10          0         10        116         29         87      4,297        364
  Largest SMSA's                                  2          0          2          7          7          0      4,024      1,150
  Other large SMSA's                              3          0          3         31         21         10      3,736        362
  Other SMSA's and highly urban                   3          0          3         48          0         48      4,891        306
  Rural and other urban areas                     2          0          2         30          1         29      4,519        301

Western Region                                   10          2          8         96         35         61      4,476        466
  Largest SMSA's                                  3          2          1          4          4          0      3,514      2,636
  Other large SMSA's                              2          0          2         10         10          0      4,244        849
  Other SMSA's and highly urban                   2          0          2         30         20         10      4,945        330
  Rural and other urban areas                     3          0          3         52          1         51      5,280        305

¹ This total is larger than the 357 strata mentioned in the text. One reason for the difference
is that several of the HIS self-representing strata were subdivided in designing the Cycle II
sample. In addition, since the HIS sample was designed, two self-representing PSU's were
split to form four PSU's, and one very small PSU which was omitted from the frame when the sample
was drawn originally is designated "self-representing." Thus, there are actually 360 PSU's in
the HIS design instead of 357.
The FSU's in the lowest density class were
almost entirely non-self-representing. Although
the average size of these FSU's was more than
300,000 in each region, each contained a number
of PSU's. In fact, about 70 percent of all
PSU's were classed as rural or small urban
areas. These PSU's were quite small, typically
containing only a few thousand people.

Two other modes of classification, called
"control classes," were added at the selection
stage: rate of population change between 1950
and 1960 and geographic dispersion within regions,
referred to as State groups.

The explicit use of the rate of population
change is considered to be a major improvement
in the design. It seems reasonable to
view the rate of population change as a gross
economic indicator and, consequently, a valuable
health indicator. A depressed area can be
generally characterized as having a below-average
population gain and often a loss, whereas
a new suburban area or new industrial area
usually shows a large population increase.



Table D. Definition of rate-of-population-change classes by geographic
region, 1950-60

                          Rate of population change
Region                    α          β          γ          δ

                                 Percentage change
Northeastern:
  SMSA PSU's              <11        11-20      21*        >21
  Non-SMSA PSU's          <9         9-16       (empty)    >16
Midwestern                <6         6-18       19-25      >25
Southern                  <5         5-21       22-42      >42
Western                   <14        14-37      38-80      >80

* In the Northeastern Region, the two stands making up the New York SMSA
constituted an entire rate-of-population-change class, giving it a
single-value definition, a 21% increase.



Table E. State groups by geographic region

Region          State group

Northeastern    1. Connecticut, Maine, Massachusetts, New Hampshire,
                   Rhode Island, Vermont
                2. New York
                3. New Jersey, Pennsylvania

Midwestern      1. Ohio
                2. Indiana, Michigan, Wisconsin
                3. Illinois
                4. Minnesota
                5. Iowa, Missouri

Southern        1. Delaware, District of Columbia, Maryland, Virginia
                2. Kentucky, Tennessee, West Virginia
                3. Alabama, Arkansas, Louisiana, Mississippi
                4. Georgia, North Carolina, South Carolina
                5. Florida

Western         1. California
                2. Oregon, Washington
                3. Texas
                4. Arizona, Colorado, Idaho, Montana, Nevada, New Mexico,
                   Oklahoma, Utah, Wyoming, Alaska, Hawaii
                5. Kansas, Nebraska, North Dakota, South Dakota



The rate-of-population-change classes also were defined on a sliding scale
for each region, as indicated in table D, in such a way that each class
contained approximately one-fourth of a region's population in 1960.
Rate-of-population-change classes were defined slightly differently for
SMSA's and non-SMSA's in the Northeastern Region. For other regions, no
distinction was made between the two groups.

State groups within regions were instituted to maximize the spread of the
sample among the States (table E). The basic criterion for forming State
groups was to make the group membership as homogeneous as possible with
respect to socioeconomic characteristics.

SELECTION OF FIRST-STAGE UNITS

In addition to utilizing the fairly extensive stratification procedures
described in the previous section, selection of units at the first stage of
sampling also incorporated a modification of the Goodman-Kish controlled
selection technique. This procedure permits some element of subjective
determination in obtaining a "better balanced" or "more representative"
sample, while retaining all the elements of true probability sampling. In
particular, as used in this survey, it permitted proportional
representation of the universe in several classes from each of five
dimensions of classification, even though only a grand total of 40 PSU's
were selected.

The units sampled at the first stage of the HES sampling process were HIS
strata. The term "first-stage unit" is employed to emphasize that,
conceptually, the units being sampled were the aggregates of all PSU's in
an HIS stratum. For example, in table F, the HIS sample PSU,
Belknap-Merrimack, N.H., refers to seven PSU's constituting a single HIS
stratum. This PSU, with a population of 97,000, was the single PSU selected
from among seven for the Health Interview Survey. The first of the FSU's
from which a sample was selected in HES stratum Dii was the group of seven
PSU's so referenced.

Prior to selecting the Cycle II FSU's, stratification was achieved for four
broad population density classes within four geographic regions. As
mentioned previously, this stratification resulted in a total of 40 HES
superstrata, 10 within each of the four geographic regions. Deeper
stratification was precluded because of the requirement of selection of
only one FSU from each superstratum. Had controlled selection not been
used, and with no other restrictions except sampling with probability
proportional to size, it would have been entirely possible, and indeed not
improbable, that almost all the 10 sample PSU's in the Northeastern Region
would be found to lie in the large metropolitan areas of New York, New
Jersey, and Pennsylvania, with no representation at all from less populated
areas such as Maine, Vermont, and New Hampshire.

An adaptation of the Goodman-Kish controlled selection technique was
utilized which provided for the identification of "control classes,"
constructed from variables other than the stratification variables, which
were then used to reduce or eliminate such "batching" or extreme clustering
of sample elements. Kish aptly refers to this as the introduction of
"controls beyond stratification." In the preceding example, such a control
would be used to increase the probability of inclusion of at least one FSU
from States such as the three smaller ones named above, while maintaining
the selection of one FSU from each stratum.

To the extent that the procedure is skillfully done, sampling variance is
reduced. (Reduction is not certain and sampling variance may actually be
increased.) Algebraic formulation of the impact of the procedure on
sampling variances is not possible (or at least cannot be estimated from
sample data from a single survey), but it is reflected in the half-sample
replicate method generally used to estimate variances in this survey. The
control of probabilities by controlled selection is analogous to the
formulation of balanced orthogonal patterns using Graeco-Latin squares
familiar in experimental design, the major difference arising in increased
complexity in calculating probability selection patterns due to unequal
probabilities in the stratum by control class cells. A good summary account
of the fundamental concepts of controlled selection is given in reference
22, pages 488-495, and more detail may be found in the 1950 original
article by L. Kish and R. Goodman. The following discussion of the
technique will be in the context of its application to the selection of the
10 FSU's for the Northeastern Region of the United States.

Classification of the 64 PSU's in the Northeastern Region into 10
superstrata on the basis of population density and size of FSU has been
previously described. The superstrata are designated as Ai, Aii, Aiii, Bi,
Bii, Ci, Cii, Ciii, Di, and Dii, where A indicates the highest population
density class, D the lowest, and i denotes the largest FSU-size class (iii
the smallest).

Control classes were next defined using two additional variables: State
group and rate of population change (from 1950-60). These classifications
for the Northeastern Region were as follows:

State
group     Composition

(1)       Maine, New Hampshire, Vermont, Massachusetts,
          Rhode Island, Connecticut
(2)       New York
(3)       New Jersey, Pennsylvania

Rate of                     Definition
population
change          SMSA PSU's        Other PSU's

α               Under 11%         Under 9%
β               11-20%            9-16%
γ               21%               (empty cell)
δ               22% and over      17% and over



Table F. Expected numbers of first-stage units and related data: HES
superstratum Dii, Northeastern Region, Cycle II

       Rate of popu-                                     1960 Census of
       lation change   HIS sample PSU (each                Population
State                  representing one HIS            HIS        Control   Expected
group  Class  Percent  stratum)                        stratum    class     number

1      α               Belknap-Merrimack, N.H.          471,000
                       Kennebec-Lincoln, Maine          464,000    935,000     .19
       δ      37       Fairfield-Litchfield, Conn.      185,000
              30       Middlesex-New Haven, Conn.       418,000
              28       Hartford-Tolland, Conn.          308,000
              21       Bristol-Norfolk-Plymouth, Mass.  455,000
              17       Kent-Newport-Providence-
                         Washington, R.I.               373,000  1,739,000     .36

2      α       6       Chautauqua, N.Y.                 338,000    338,000     .07
       β      15       Chemung, Tioga-Tompkins, N.Y.    321,000    321,000     .07
       δ      25       Orange-Putnam, N.Y.              265,000    265,000     .05

3      α       1       Lycoming, Pa.                    256,000
               7       Lebanon-Schuylkill, Pa.          264,000    520,000     .11
       β      13       Mercer, Pa.                      127,000    127,000     .02
       δ      56       Monmouth-Ocean, N.J.             443,000
              23       Cumberland-Cape May, N.J.        155,000    598,000     .13

Total                                                            4,843,000    1.00








These variables define the controls beyond stratification which will relate
to each stratum. Since one FSU is to be drawn from each superstratum and
since the selection is to be with probability proportional to population
size, the first step in the procedure is to determine expected numbers of
FSU's in each control group by relating the populations of the control
groups to a proportionate base of one FSU. For Cycle II, table F shows data
for superstratum Dii.
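
The calculation described above can be sketched in a few lines. This is only an illustration, assuming the control-class populations as read from table F; the dictionary keys and the function name `expected_numbers` are hypothetical labels, not part of the survey documentation. Each expected number is the control-class population divided by the superstratum population, so that the expected numbers for the superstratum sum to one FSU. Two printed entries (.02 and .13) differ from the computed ratios by about .01, presumably because of rounding in the published table.

```python
# Control-class populations (1960 census) for superstratum Dii, as read
# from table F. Keys are (State group, rate-of-change class); the spelled-out
# class names are illustrative labels only.
control_class_population = {
    (1, "alpha"): 935_000,
    (1, "delta"): 1_739_000,
    (2, "alpha"): 338_000,
    (2, "beta"): 321_000,
    (2, "delta"): 265_000,
    (3, "alpha"): 520_000,
    (3, "beta"): 127_000,
    (3, "delta"): 598_000,
}


def expected_numbers(populations):
    """Expected number of sample FSU's in each control class when exactly
    one FSU is drawn from the superstratum with probability proportional
    to population size."""
    total = sum(populations.values())
    return {cell: pop / total for cell, pop in populations.items()}


expected = expected_numbers(control_class_population)
for cell, value in expected.items():
    print(cell, f"{value:.2f}")
```

The values printed for the first two cells, .19 and .36, agree with the last column of table F.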

Corresponding calculations for each superstratum result in expected numbers
of FSU's for the full table of superstrata by control classes. These form
the basic selection matrix for controlled selection analogous to the
Graeco-Latin square. The full matrix for Cycle II data is shown in table G.



Values in the selection matrix show the expected numbers of sample FSU's
which will be selected within any given cell. If the expected number is
1.0, exactly one FSU corresponding to that cell will be selected, and if
the expected value is exactly zero, there will be no sample FSU's
corresponding to that cell. If the expected number is 0.m, the probability
is 0.m that one FSU corresponding to that cell will be selected, and
1 - 0.m that no sample FSU's corresponding to that cell will be selected.

The marginal row totals ensure that exactly one FSU will be selected from
each superstratum, and the marginal column totals reflect the control
beyond stratification of the control classes. For example, for the control
class in State group (3) whose column total is 1.78, the probability is .78
that two FSU's will be selected from this class and .22 that only one will
be selected. It is impossible that this control class will not be
represented by any sample FSU.

Table G. Selection matrix for Northeastern Region, Cycle II,
rate-of-population-change class and State group

[Table G: one row for each of the 10 HES superstrata, each with a row total
of 1.00, and one column for each control class. The column totals are 1.20,
.26, 1.50, .54, .61, 1.78, 2.00, .97, .45, and .69, and the grand total is
10.00. The individual cell entries could not be recovered.]
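
The reading of a marginal total given above (1.78 meaning two FSU's with probability .78 and one with probability .22) follows a simple rule that can be written out. The sketch below is an illustration only; the function name `count_distribution` is hypothetical, and it assumes, as the text describes, that the number selected is always one of the two whole numbers on either side of the expected number.

```python
import math
from fractions import Fraction


def count_distribution(expected_number):
    """Return {count: probability} for a cell or margin whose expected
    number of sample FSU's is `expected_number` (given as a string such
    as "1.78" so that the arithmetic is exact)."""
    e = Fraction(expected_number)
    lower = math.floor(e)      # whole FSU's that are certain to be taken
    frac = e - lower           # probability of one additional FSU
    if frac == 0:
        return {lower: Fraction(1)}
    return {lower: 1 - frac, lower + 1: frac}


dist = count_distribution("1.78")
print({k: float(v) for k, v in dist.items()})
```

For an expected number of 1.78 the function returns probability 11/50 (.22) for one FSU and 39/50 (.78) for two, matching the example in the text.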

Next, a set of selection patterns is developed which meets the requirements
of the probabilistic restrictions of the selection matrix. The process is
conveniently illustrated by a small, hypothetical case of two strata and
two control classes.



               Control class
Stratum        1        2        Total

I              .4       .6       1.0
II             .7       .3       1.0

Total          1.1      .9       2.0



The marginal limitations of the columns in this case imply that patterns
may be formed by selecting 1 or 2 elements from class 1 and 0 or 1 element
from class 2. Thus, 3 patterns are possible, namely:



             Control          Pattern
Stratum      class        1      2      3

I            1            1      1      0
             2            0      0      1
II           1            0      1      1
             2            1      0      0



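Calculation of the probabilities of occurrence of these patterns involves
solving the following equations, in which $P_\lambda$ is the probability
associated with the $\lambda$th pattern.

$$
\begin{bmatrix}
1 & 1 & 0 \\
0 & 0 & 1 \\
0 & 1 & 1 \\
1 & 0 & 0 \\
1 & 1 & 1
\end{bmatrix}
\begin{bmatrix} P_1 \\ P_2 \\ P_3 \end{bmatrix}
=
\begin{bmatrix} .4 \\ .6 \\ .7 \\ .3 \\ 1.0 \end{bmatrix}
$$
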
The last row of the coefficient matrix reflects the requirement that the
sum of the probabilities of all patterns equals 1. For this simple and very
restricted example there is a unique solution since the rank of the
coefficient matrix equals three. However, in more complicated cases, the
solution is usually not unique, and the judgmental decisions made in
choosing patterns with nonzero probabilities influence the effectiveness of
the procedure in achieving reduction of sampling variability. As the number
of control classes and strata increases, the complexity of forming the
selection patterns and calculating their associated probabilities increases
rapidly. Kish has presented a method of forming selection patterns by
successive subtraction of cell probabilities, and Schnack has developed a
computer routine whereby sets of patterns may be generated and the
resulting equations may be solved for the associated probabilities. (There
is no unique scheme which is favored by all, or even a majority, of
samplers.)
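
For the two-stratum example, the pattern probabilities can be worked out exactly. The sketch below is an illustration under stated assumptions: it is not the Kish subtraction method or the Schnack routine mentioned above, only a plain Gauss-Jordan elimination on the five equations of the example, and the function name `solve_exact` is hypothetical. Each of the first four rows of `A` records which patterns use a given cell; the last row requires the probabilities to sum to 1.

```python
from fractions import Fraction

# Rows: cells (I,1), (I,2), (II,1), (II,2), then the sum-to-one condition.
# Columns: patterns 1, 2, 3 of the hypothetical example.
A = [[1, 1, 0],
     [0, 0, 1],
     [0, 1, 1],
     [1, 0, 0],
     [1, 1, 1]]
b = ["0.4", "0.6", "0.7", "0.3", "1"]


def solve_exact(A, b):
    """Gauss-Jordan elimination over exact fractions. Raises ValueError
    if the solution is not unique or the equations are inconsistent."""
    rows = [[Fraction(x) for x in r] + [Fraction(y)] for r, y in zip(A, b)]
    n = len(A[0])
    pivot_row = 0
    for col in range(n):
        pr = next((r for r in range(pivot_row, len(rows))
                   if rows[r][col] != 0), None)
        if pr is None:
            raise ValueError("solution is not unique")
        rows[pivot_row], rows[pr] = rows[pr], rows[pivot_row]
        piv = rows[pivot_row][col]
        rows[pivot_row] = [x / piv for x in rows[pivot_row]]
        for r in range(len(rows)):
            if r != pivot_row and rows[r][col] != 0:
                f = rows[r][col]
                rows[r] = [x - f * y for x, y in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    # Any leftover equation must have reduced to 0 = 0.
    if any(row[-1] != 0 for row in rows[n:]):
        raise ValueError("equations are inconsistent")
    return [rows[i][-1] for i in range(n)]


probs = solve_exact(A, b)
print([float(p) for p in probs])
```

The solution is P1 = .3, P2 = .1, and P3 = .6, and the two redundant equations reduce to zero, which is consistent with the statement that the coefficient matrix has rank three.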

For the Northeastern Region, a set of 17 patterns formed a complete set;
that is, a set with associated probabilities totaling 1. The first six of
these are indicated in table H. A single pattern is next chosen with
probability proportional to the probability of occurrence of the pattern by
selecting a random number between 0 and 1. For the data shown in table H
the random number was .34 and pattern 3 was used in the survey.
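
The choice of a pattern from a random number can be illustrated with the pattern probabilities given in table H for the first six of the 17 patterns. This is a sketch only; the function name `choose_pattern` is hypothetical, and the rule assumed here is that the selected pattern is the first one whose cumulative probability reaches the random number. Because only six of the 17 probabilities are listed, the function declines to answer for random numbers above .68.

```python
from bisect import bisect_left
from itertools import accumulate

# Probabilities of patterns 1-6 as given in table H (six of 17 patterns).
pattern_probs = [0.17, 0.14, 0.07, 0.13, 0.06, 0.11]
cumulative = list(accumulate(pattern_probs))


def choose_pattern(r, cum):
    """Return the 1-based number of the first pattern whose cumulative
    probability is at least r."""
    idx = bisect_left(cum, r)
    if idx == len(cum):
        raise ValueError("random number falls beyond the listed patterns")
    return idx + 1


print(choose_pattern(0.34, cumulative))
```

With the random number .34 reported in the text, the cumulative probabilities .31 and .38 bracket the draw, so pattern 3 is selected.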

Table H. Partial coefficient matrix of the first six of 17 selection
patterns, HES Cycle II, Northeastern Region

[Table H: rows are cells of the selection matrix, identified by HES
superstratum (Ai through Dii), State group, and rate-of-population-change
class; columns are patterns 1 through 6, with entries of 1 or 0. The
individual entries could not be recovered.]

                            Pattern
                            1      2      3      4      5      6

Probability of pattern      .17    .14    .07    .13    .06    .11
Cumulative probability      .17    .31    .38    .51    .57    .68

A final selection is necessary for those cells in the pattern which contain
more than one FSU. For example, table H shows that one FSU is to be
selected in stratum Dii, control class 1δ (State group 1, rate-of-change
class δ). Table F shows the five FSU's constituting this cell. The final
sampling operation selects one of the five with probability proportional to
population size. If we denote $X_{js}$ as the size of the $j$th control
group in the $s$th HES stratum and add a third subscript for the $p$th FSU
within this cell, this final selection is with probability

$$\frac{X_{jsp}}{X_{js}}.$$

Further, if we denote

$P_j$ = probability of selection of the $j$th cell
$P_\lambda$ = probability of selection of the $\lambda$th pattern
$X_s$ = population size of the $s$th superstratum
$P_{j\lambda}$ = $P_\lambda$ for all patterns which include $j$
               = 0 for all other patterns,

clearly $P_j = \sum_\lambda P_{j\lambda}$. However, since the original cell
probabilities in table G are consistent with the $P_\lambda$, it is true
that

$$P_j = \frac{X_{js}}{X_s}.$$

Thus the final probability of selection of each sample FSU was

$$\frac{X_{jsp}}{X_{js}} \cdot \frac{X_{js}}{X_s} = \frac{X_{jsp}}{X_s}.$$

That is, within each stratum, the probability of selection of each FSU was
proportional to its population size, this sampling procedure having been
maintained while providing the controls beyond stratification to reduce the
probability of highly unrepresentative combinations and, hence, to achieve
a reduction in sampling variability. The FSU, or HIS stratum, having been
thus selected, the PSU previously selected to represent the HIS stratum was
then selected with probability 1 for purposes of the Health Examination
Cycle II Survey. However,
the actual probability of selecting the PSU from 
an FSU was proportional to the size of the FSU. 
Consequently, the probability of selecting a PSU 



was 
WITHIN PSU DESIGN

Problems of Development 

A first-stage sample of 40 PSU's and the use of two mobile examining
centers would permit the examination of about 180 children in each sample
PSU, or a total of about 7,200 examinees over a 2-year period. The within
PSU design focused on the problem of selecting a probability sample of
8,000 children aged 6 to 11, or 200 in each sample PSU, under the
assumption that 90 percent of the children would be examined.
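
The figures in the paragraph above are related by simple arithmetic, restated here as a check (no new survey data are introduced):

```latex
\begin{align*}
40 \text{ PSU's} \times 200 \text{ children per PSU} &= 8{,}000 \text{ children selected},\\
0.90 \times 8{,}000 &= 7{,}200 \text{ children examined},\\
7{,}200 \div 40 &= 180 \text{ children examined per PSU}.
\end{align*}
```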

In developing the within PSU design, several problems had to be considered.
The first was how to construct the universe, or sampling frame, to assure
that every person in the target population has a chance of being selected
in the sample. Secondly, there was some concern during the early stages of
planning that parents would be reluctant to let their children travel long
distances for an examination. One-way distances of 20 to 50 miles could be
expected frequently, and occasionally more than 50 miles, if the sample
segments were randomly selected throughout the PSU's. Thus, for large
SMSA's and other PSU's covering large geographic areas, an intermediate
stage of selection needed to be developed. Other problems to be considered
in the within PSU design were how to select a sample of segments, or
clusters of eligible children, and how to select a sample of children to be
examined.

Coverage of the Universe 

The problem of selecting a probability sample of individuals is necessarily
a complex one because there is no single best frame from which to select
the sample and yet ensure complete coverage of the universe; in this case,
the noninstitutional population aged 6 through 11 residing outside Indian
reservations. First, it will be useful to consider that the universe can be
divided into four quadrants shown in the table below. The building blocks
are the 1960 Census Enumeration Districts, which are small, well-defined
areas of about 200 housing units into which the entire Nation was divided
for the 1960 Census of Population. Each enumeration district (ED) can be
allocated to one and only one of the four quadrants according to a set of
rules established by the Bureau of the Census. Enumeration districts whose
1960 Census Listing Books contain a high proportion of locatable or usable
addresses are judged to be in either Quadrant A or B. Other ED's, mostly
those with R.F.D. route addresses, are assigned to Quadrant C or D. Within
each pair, the assignment (A or C as against B or D) is based upon whether
or not the ED is in a jurisdiction which maintains lists of building
permits which can serve as a sampling frame. The approximate distribution
of ED's, and consequently any sample of households, among the four
quadrants is shown in parentheses in the following table:

                            Usable        Not-usable     Both
                            addresses     addresses      types

All areas                   (0.67)        (0.33)         (1.00)

Building permit areas       A (0.57)      C (0.28)       (0.85)
Nonpermit areas             B (0.10)      D (0.05)       (0.15)

The total universe of children eligible for the Health Examination Survey
can be divided into the following four subuniverses:

  I. Eligible children residing in housing units listed in the 1960 census
     in ED's defined as having usable addresses.

 II. Eligible children residing in housing units listed in the 1960 census
     in ED's defined as not having usable addresses.

III. Eligible children residing in housing units missed in the 1960 census.

 IV. Eligible children residing in housing units built since the 1960
     census.

A PSU can, and usually does, contain ED's in each of the four quadrants.
Furthermore, ED's generally contain children from three subuniverses,
either I, III, and IV or II, III, and IV. Note that Subuniverses I and II
are mutually exclusive. Subject to some possible errors in the application
of the methods, coverage was made of the total universe by the following
methods:

1. Quadrant A. Subuniverse I was represented in the survey by a sample of
clusters of addresses, called list segments, from 1960 Census Listing
Books. Subuniverse III was given representation by a sample of
"supplemental blocks." Supplemental blocks are chunks of land area, often a
city block. For the Health Examination Survey, one supplemental block of
about 24 housing units was selected for every three list segments selected.
A map of each supplemental block and the 1960 Census Listing Book for the
ED from which each was drawn were given to an interviewer for listing about
2 months prior to the initial interview date for the sample PSU. Any
housing units in the supplemental blocks built prior to April 1960 and not
listed in the 1960 Census Listing Book for their ED's were added to the
sample under the assumption that they had been missed in the census.
Subuniverse IV received coverage from a sample of building permits issued
since April 1960.

2. Quadrant B. The methods used to ensure coverage for ED's in Quadrant B
differ from Quadrant A only in that both Subuniverses III and IV were given
representation by the sample of supplemental blocks. This was accomplished
by including in the sample all housing units not in the 1960 Census Listing
Book, not only those built before April 1960.

3. Quadrant C. Representation was given to Subuniverses II and III by a
sample of small area segments selected from ED's defined as not having
usable addresses, and to Subuniverse IV by a sample of building permits
issued after April 1960. Any overlap between the two frames was resolved by
an inquiry into the date of construction of housing units in sample area
segments and a deletion of any constructed after April 1960.

4. Quadrant D. Finally, since no building permit data were available for
ED's of Quadrant D, the area segments provided coverage for Subuniverse IV
as well as for Subuniverses II and III. Since only about 5 percent of the
sample is drawn from ED's in Quadrant D and there is probably little new
construction in these predominantly rural areas, it is unlikely that there
would be any sizable contribution to the mean square error arising from
this quadrant.
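
The four coverage rules above can be summarized as a small lookup table. The sketch below is an illustrative restatement of the text only; the names `coverage`, `share`, and `frames_for` are hypothetical, and the shares are the approximate quadrant proportions given in the table of quadrants.

```python
# For each quadrant, the frame that gives each subuniverse its
# representation, as described in items 1-4 above.
coverage = {
    "A": {"I": "list segments",
          "III": "supplemental blocks",
          "IV": "building permits"},
    "B": {"I": "list segments",
          "III": "supplemental blocks",
          "IV": "supplemental blocks"},
    "C": {"II": "area segments",
          "III": "area segments",
          "IV": "building permits"},
    "D": {"II": "area segments",
          "III": "area segments",
          "IV": "area segments"},
}

# Approximate share of ED's in each quadrant.
share = {"A": 0.57, "B": 0.10, "C": 0.28, "D": 0.05}


def frames_for(quadrant):
    """Sorted list of the distinct frames needed in a quadrant."""
    return sorted(set(coverage[quadrant].values()))


permit_quadrants = sorted(q for q, c in coverage.items()
                          if "building permits" in c.values())
print(permit_quadrants, frames_for("D"))
```

Laid out this way, it is easy to see that every quadrant covers three subuniverses, that the building permit frame is used only in Quadrants A and C, and that Quadrant D relies on area segments alone.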

Selection of Localities Within PSU's 

This intermediate stage of selection was considered important in the early
stages of the cycle because it minimized the burden on the children and
their parents by reducing the distance that some would have to travel to
the examining center. If long distance travel should be a problem, then the
selection of localities within sample PSU's should tend to maximize the
response rate and also reduce the cost of the survey.

A locality was defined basically in terms of Census minor civil divisions.
Thus it was typically a city, part of a city, village, town, county, or the
nonurbanized part of a county. The ultimate goal for a locality was that it
should contain at least 250 children aged 6-11, or an elementary school
enrollment of 250 children, or an area containing at least 2,000 people
according to the 1960 census. The selection of an intermediate stage sample
was not done routinely, but it was done on a PSU-by-PSU basis, after a
review of the problem had been made by an NCHS-Census working committee.

Intermediate samples were selected for six PSU's only: Ashtabula-Geauga
Counties, Ohio; Columbia-Dutchess Counties, New York; and the Denver,
Philadelphia, Los Angeles, and Boston SMSA's. The procedure was discarded
after the 10th stand because it was found that clustering sample segments
in two or three areas, sometimes distant from feasible sites for the
examination center, created an adverse situation. Random sampling of
segments with probability proportional to size without an intermediate
stage of sampling concentrated the sample around population centers where
feasible examination centers could be located. Furthermore, there was
little if any evidence that distance from a sample person's home to the
examining site affected the participation rate or that mothers were
reluctant to have their children travel so far. Also, any reduction in cost
that accrued by sampling locations was more than compensated for by
increased design efficiency resulting from the elimination of a stage of
sampling.

For those six PSU's where locality sampling was used, after division into
localities, a sample of three was drawn with a probability proportional to
their 1960 population of children 6-9 years of age. In an SMSA one of the
localities was the central city of the SMSA, and it was selected with
certainty. From the remaining localities, which numbered from four to nine
for the PSU's subsampled, two others were selected.
In four of the six PSU's subsampled, the Lahiri sampling technique was
used. The method may be described briefly as follows:

1. Let the localities in a PSU be represented by
$L_1, \ldots, L_j, \ldots, L_N$, which have measures of size
$A_1, \ldots, A_j, \ldots, A_N$, the total number of children 6-9 years of
age in each locality according to the 1960 census.

2. Let $A^*$ be a number not smaller than the sum of the $m$ largest
measures of size in the PSU.

3. Select, without replacement, a simple random sample of $m$ localities.

4. Choose a random number $R$ in the interval $1 \le R \le A^*$.

5. If $R \le \sum_{j=1}^{m} A_j$, where the sum is taken over the $m$
localities selected in (3), use the sample of size $m$ selected in (3). If
not, repeat the procedure until the condition is satisfied.
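
The five steps above can be sketched as a short acceptance-rejection loop. This is an illustration only: the locality names and sizes are hypothetical, the function name `lahiri_sample` is not from the report, and the bound is taken to be exactly the sum of the m largest measures of size, which is the smallest value permitted by step 2.

```python
import random


def lahiri_sample(sizes, m, rng):
    """Select m localities by the procedure in steps 1-5.

    sizes -- dict mapping locality name to its measure of size
    m     -- number of localities to select
    rng   -- a random.Random instance
    """
    # Step 2: a bound not smaller than the sum of the m largest sizes.
    bound = sum(sorted(sizes.values(), reverse=True)[:m])
    names = list(sizes)
    while True:
        # Step 3: simple random sample of m localities, without replacement.
        sample = rng.sample(names, m)
        # Step 4: random number R with 1 <= R <= bound.
        r = rng.randint(1, bound)
        # Step 5: accept if R does not exceed the combined size of the sample.
        if r <= sum(sizes[name] for name in sample):
            return sample


# Hypothetical measures of size for five non-central-city localities.
sizes = {"L1": 1200, "L2": 800, "L3": 450, "L4": 300, "L5": 250}
print(lahiri_sample(sizes, 2, random.Random(1960)))
```

Because a candidate sample is accepted with probability proportional to its combined measure of size, samples made up of larger localities are retained more often, without any need to cumulate sizes in advance.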

Because of the desire to control on the geographic spread of the sample in
the Los Angeles and Philadelphia SMSA's, controlled selection of localities
was used. The procedure will not be described since it is basically the
same as that described above for the selection of first-stage units.
However, it may be instructive to know how the populations were classified
prior to sample selection.

The Philadelphia SMSA extends over the city of Philadelphia and three
counties in New Jersey and four counties in Pennsylvania. The sampling plan
was to select Philadelphia with certainty and two of the counties with a
probability proportional to their 1960 population. To maximize the
representativeness of the sample with respect to its urban-rural
characteristics and geographic spread, the counties were grouped into a
controlled selection matrix according to three degree-of-urbanization
classes (over 90%, 70-80%, and less than 70% urban) and State of location.
Four controlled selection patterns of two counties each were formed. Then
one of the four patterns was chosen by a random procedure.

The population of the Los Angeles SMSA was greater than the maximum size
that had been set for a single stand (5 million) but was smaller than the
minimum size of a double stand (8 million). This was also true of the
Chicago SMSA, which had a 1960 population slightly below that of Los
Angeles. To achieve a balance for the two areas it was decided to select 32
segments from the Los Angeles SMSA and 28 from Chicago, which, when
combined, was the equivalent of three HES stands.

The Los Angeles SMSA was divided into the city of Los Angeles and four
other strata of approximately 1 million people each. This stratification
was accomplished by ranking the Census Minor Civil Divisions by their 1960
population size and dividing the total into quartiles. In addition to the
city, one census division was drawn from each of the strata, controlling on
geography and four population density-income classes.

Selection of Segments 

As stated in the section on coverage of 
the universe, there were four types of segments. 
List and area types of segments were the princi- 
pal ones since they covered the vast majority 
of the target population. The other two, permit 
and supplemental block segments, were quite 
small since they included only eligible children 
residing in housing units built since the 1960 
census and those residing in housing units missed 
in the 1960 census. 

With only three exceptions, 20 segments were selected within each sample
PSU from the frame of 1960 ED's contained in the sample PSU. Commonly, this
sample of segments consisted of a combination of the two types depending on
the character of the Census Listing Book of Addresses (usable or not) in
the ED from which the segment was selected. In addition, about three permit
and supplemental block segments per PSU were selected, averaging about 1.5
eligible children per segment.

The area and list segments contained an expected nine children aged 5 to 9
in 1960 or about 11 children aged 6 to 11 at the time of the survey. Since
the number of eligible children in a housing unit was a variable, there was
a chance that 20 segments (plus the permit and supplemental block segments)
would not yield the desired minimum sample of about 180 children. To
overcome this potentiality, two reserve segments were selected, in addition
to the 20, for the first eight stands. It became apparent at that time that
20 segments were sufficient, and therefore the selection of reserve
segments was discontinued. The experience using this procedure was on the
whole satisfactory as indicated in table J, which shows the numbers of
segments, interviewed housing units, and eligible children in the sample,
by PSU and type of segment.

The sample of segments was selected in two steps. First, each ED was
assigned a measure of size equal to a rounded whole number resulting from a
division by 9 of the number of children aged 5 to 9 in the ED at the time
of the 1960 census. Then a sample of 20 ED's was selected (except for the
first eight stands when 22 were chosen) with probabilities proportional to
the measures of size assigned to the ED's. Each sample ED was subsequently
divided into as many roughly equal-sized segments, either area or list
segments, as there were measures of size. The final step in the process was
a random selection of one segment from each enumeration district. The
selection procedure may be illustrated by a hypothetical example.

In the 1960 Census of Population, suppose a PSU was divided into 500
enumeration districts containing an average of about 200 housing units
each. The addresses of the housing units were often not well-defined street
numbers, so "area segments" were selected from this PSU.

For each ED the number of children aged 5 to 9 was determined as shown in
the following table. Also shown are the appropriate "measures of size"
resulting by dividing by 9 the number of children aged 5 to 9 in each ED,
and the accumulation of measures of size over the entire PSU.



              Number of
              children      HES           Accumulative
ED            aged          measure       measure
number        5 to 9        of size       of size

1                 25            3               3
2                 37            4               7
3                 20            2               9
4                 64            7              16
5                 15            2              18
.                  .            .               .
.                  .            .               .
499               40            4           1,647
500               30            3           1,650

Total         15,000        1,650           1,650
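
The measures of size and their accumulation for the first five ED's of the hypothetical table can be reproduced as follows. This is a sketch under stated assumptions: the function name `locate` is hypothetical, and the use of a number between 1 and the accumulated total to pick an ED is shown only as one common way of drawing with probability proportional to size, not necessarily the exact mechanics used in the survey.

```python
import random
from bisect import bisect_left
from itertools import accumulate

# Children aged 5 to 9 in the first five ED's of the hypothetical example.
children_5_to_9 = {1: 25, 2: 37, 3: 20, 4: 64, 5: 15}

# Measure of size: number of children divided by 9, rounded to a whole number.
measure = {ed: round(n / 9) for ed, n in children_5_to_9.items()}
ed_numbers = list(measure)
cumulative = list(accumulate(measure.values()))


def locate(hit):
    """Return the ED whose accumulated range of measures contains `hit`,
    where 1 <= hit <= cumulative[-1]."""
    if not 1 <= hit <= cumulative[-1]:
        raise ValueError("hit is outside the accumulated measures of size")
    return ed_numbers[bisect_left(cumulative, hit)]


rng = random.Random(1960)
ed = locate(rng.randint(1, cumulative[-1]))
# One of the ED's equal-sized segments is then chosen at random.
segment = rng.randint(1, measure[ed])
print(measure, cumulative, ed, segment)
```

The computed measures (3, 4, 2, 7, 2) and accumulated values (3, 7, 9, 16, 18) agree with the first five rows of the table; an ED with measure of size 7, for example, is divided into seven segments and is seven times as likely to be hit as an ED with measure 1.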
Table J. Numbers of segments, interviewed housing units, and eligible
children in the sample, by PSU and type of segment

(Housing units are interviewed housing units; children are eligible
children.)

                  Total                  List and area segments      Permit and supplemental
                                                                     block segments
PSU     Segments   Housing  Children  Segments   Housing  Children  Segments   Housing  Children
                     units                         units                         units

Total        954    21,393     8,589       820    20,928     8,382       134       465       207

1             28       630       200        22       615       195         6        15         5
2             25       475       246        22       469       241         3         6         5
3             26       638       248        22       628       239         4        10         9
4             23       602       218        22       602       218         1         0         0
5             25       600       230        22       592       227         3         8         3
6             25       459       206        22       448       204         3        11         2
7             31       505       240        22       473       224         9        32        16
8             26       451       240        22       446       236         4         5         4
9             22       410       248        20       402       240         2         8         8
10            20       727       147        16       708       143         5        19         4
11            24       777       201        20       740       191         4        37        10
12            24       694       138        16       679       130         8        15         8
13            24       546       246        20       520       234         4        26        12
14            23       459       196        20       439       188         3        20         8
15            22       539       193        20       534       193         2         5         0
16            22       882       220        20       882       220         2         0         0
17            23       689       195        20       676       193         3        13         2
18            23       395       241        20       387       239         3         8         2
19            24       727       226        20       708       221         4        19         5
20            24       423       252        20       410       242         4        13        10
21            21       379       218        20       373       217         1         6         1
22            21       495       234        20       493       233         1         2         1
23            37       690       301        32       673       288         5        17        13
24            23       451       160        20       442       156         3         9         4

20 


434 


221 


- 20 


434 


221 


0 


0 


0 


25 


408 


188 


20 


402 


183 


2 


6 


5 


22 


338 


186 


20 


330 


134 


2 


8 


2 


22 


267 


179 


20 


263 


i-;6 


2 


4 


3 


25 


528 


239 


20 


507 


231 


5 


21 


8 


23 


421 


149 


20 


400 


139 


3 


21 


10 


24 


450 


216 


20 


437 


207 


4 


13 


9 


24 


506 


250 


20 


498 


248 


4 


8 


2 


25 


650 


260 


20 


626 


247 


5 


24 


13 


20 


422 


239 


20 


422 


239 


0 


0 


0 


23 


680 


231 


20 


675 


226 


3 


5 • 


5 


24 


492 


218 


20 


478 


209 


4 


14 


9 


22 


596 


222 


20 


589 


220 


4 


7 


2 


26 


616 


228 


20 


595 


226 


6 


21 


2 


22 


545 


163 


20 


540 


159 


2 


5 


4 


21 


397 


156 


20 


393 


155 


1 


4 


1 



Permit and supple- 
mental block segments 
To determine a sample of 20 ED's, divide
1,650 by 20 to get the length of the sampling
interval (82.5). Select a random number between
0.1 and 82.5, say 13.0. This chooses ED number
4, since 13.0 falls between the accumulated
measures 9 and 16. The remaining 19 ED's are
determined by successively adding the sampling
interval to this random number and accumulating
until the total exceeds 1,650 measures of size.
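
The systematic selection just described can be sketched as follows. This is a minimal illustration rather than the Census Bureau's actual procedure; the function name is invented here, and only the first five ED's of the hypothetical table are used so that the result can be checked by hand.

```python
from bisect import bisect_left
from itertools import accumulate


def systematic_pps(measures, n_select, start):
    """Systematic selection with probability proportional to measure of size.

    measures : measure of size for each ED, in listing order
    n_select : number of ED's to select
    start    : random start, greater than 0 and at most one sampling interval
    Returns the sampling interval and the 1-based numbers of the selected ED's.
    """
    cumulative = list(accumulate(measures))
    interval = cumulative[-1] / n_select
    if not 0 < start <= interval:
        raise ValueError("start must lie in (0, interval]")
    selected = []
    for k in range(n_select):
        point = start + k * interval
        # First ED whose accumulated measure reaches the selection point;
        # the min() guards against floating-point overshoot on the last point.
        index = min(bisect_left(cumulative, point), len(cumulative) - 1)
        selected.append(index + 1)
    return interval, selected


# ED's 1-5 of the hypothetical table: accumulated measures 3, 7, 9, 16, 18.
interval, eds = systematic_pps([3, 4, 2, 7, 2], n_select=2, start=4.0)
print(interval, eds)  # 9.0 [2, 4]
```

With the full list of 500 ED's the interval would be 1,650 / 20 = 82.5, and a start of 13.0 would again fall in ED number 4. An ED whose measure of size exceeded the interval could be hit more than once; the sketch does not treat that case specially.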

One segment is then selected from each
sample ED. ED number 4 has a measure of
size of 7; that is, it contains 7 segments with an
average number of 11 children expected in each.
Since these are area segments, it is necessary
to identify their boundaries and the approximate
numbers of housing units contained within the
segments. After the boundaries of the 7 segments
have been determined, one of them is chosen,
each having an equal probability of selection.

Selection of Sample Children 

The next step in the sample design was to
identify the sample of children who were eligible
to participate in the health examination.
At each of the sample households a Census
interviewer made a visit and asked certain
questions. The questionnaire used is shown in
appendix III.

The front of the questionnaire is concerned
primarily with standard Census identification
entries related to the housing unit. On the inside,
the first group of questions that was asked
identified the household composition. If there were no
eligible children in the household, the interview
was concluded with a few questions related to
the possible presence of another household on the
premises. In households in which there were
eligible children, the remainder of the questionnaire
was completed. A more detailed report of the
administration of this questionnaire as well as the
general plan, operation, and response results of
the survey has been published.

The 954 segments in the sample yielded a
total of 25,106 households. Of these, 21,393 were
interviewed, 2,291 were found to be vacant or to
belong to persons having a usual residence
elsewhere, and at 22 the composition of the
household could not be established because of
refusals or because no one was home despite
repeated calls.

In addition to the households identified above,
798 of the expected housing units in the original
Census listing were found to have been demolished,
outside segment boundaries, converted to business
or storage, or merged with another unit.

The households interviewed yielded a total
of 8,589 eligible children. The distribution of the
number of segments, interviewed housing units,
and eligible children for each PSU is shown in
table J.

There was, however, a limit on the number
of children that could be examined at a particular
PSU. The time available for examinations
at a particular PSU was necessarily set far in
advance of any preliminary fieldwork. Therefore,
the number of examinations that could be
performed was dependent upon the number of
examining days available. At most locations the
number of days available, excluding Saturdays,
was 18. The daily schedule of examinations
called for six children in the morning and six
in the afternoon, so that 216 examining slots
(18 days x 12 children) were available. However,
because rescheduling was necessary for
cancellations or no-shows, the maximum number
of children who could be examined was
approximately 200. At 26 locations, it was
necessary to subsample the eligible children to
yield around 190-200 sample children for
examination.

Subsampling was accomplished through use
of a master list which consisted of the names of
eligible children determined in the household
interviews. All eligible children in the PSU were
listed in order by segment, serial (household
order within segment), and column number (order
in the household by age) and then numbered.
After the desired subsampling rate had been
determined, every nth name on the list was
deleted, starting with the yth name, y being a
number between 1 and n selected randomly.
For example, if the total number of eligible
children was 220, then a subsampling rate of one
in 10 could be used, which would reduce the
number to 198. Selecting a random number between
1 and 10, say 4, the fourth eligible
child on the master list would be deleted from
the sample, as would every 10th following child,
e.g., numbers 14, 24, 34, and 44.
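
The deletion rule in the example above can be checked with a short sketch. The function name is illustrative, and the master list is stood in for by the position numbers 1 through 220.

```python
def systematic_deletion(master_list, n, y):
    """Delete every nth entry of the master list, starting with the yth.

    Positions are 1-based, as on the numbered master list.
    Returns the retained entries and the deleted positions.
    """
    if not 1 <= y <= n:
        raise ValueError("y must be between 1 and n")
    deleted = set(range(y, len(master_list) + 1, n))
    kept = [child for position, child in enumerate(master_list, start=1)
            if position not in deleted]
    return kept, sorted(deleted)


children = list(range(1, 221))  # 220 eligible children, in master-list order
kept, deleted = systematic_deletion(children, n=10, y=4)
print(len(kept), deleted[:5])   # 198 [4, 14, 24, 34, 44]
```

Deleting rather than selecting names keeps the retained sample spread over all segments and households in master-list order.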






ESTIMATION PROCEDURE 

An examination finding for an individual sam- 
ple child is shown in data tabulations as a 
weighted frequency. This weight is a product of 
the reciprocal of the probability of selecting the 
child, an adjustment for nonresponse (not exam- 
ined), and a poststratified ratio adjustment. The 
last was used to increase precision by bringing 
survey results into closer alignment with known 
U.S. population figures by color and sex within
single years of ages 6 through 11.

The sample of slightly more than 7,400 children
was arrived at by three stages of selection.
The probability of an individual's being selected
was the product of the probabilities of selection
at the three stages. In the first stage a single PSU
was selected from each stratum. Within each sample
PSU, a sample of segments, each expected to contain
about 11 eligible children, was selected. Then
a subsample of the eligible children was selected
when the number of eligible children exceeded 200
in a sample PSU.

Since the strata are roughly equal in population
size and a nearly equal number of sample
children were examined in each of the sample
PSU's, the sample design is essentially
self-weighting with respect to the target population;
that is, each child 6 to 11 years old has about
the same probability of being drawn into the
sample.



The adjustment for nonresponse is intended
to minimize the impact of nonresponse on final
estimates by imputing to nonrespondents the
characteristics of "similar" respondents, that is,
by relating nonrespondents to respondents by
ancillary data known for both. Nonresponse due
to refusals to be interviewed and "not-at-homes"
amounted to only 22 households, so that the only
nonresponse category requiring some adjustment
was the "failure to be examined" nonresponses,
which amounted to 3.9 percent of the 7,417 sample
children. "Similar" respondents were judged
to be children in the same sample PSU having
the same age (in years) and sex as the children
not examined in the sample PSU. The weights of
all respondents in a PSU having the same age and
sex were adjusted upward to give representation to
the nonrespondents in the PSU having that age and
sex. Table K shows the total number of eligible
children identified, the number of sample children,
and the percent of sample children examined,
by age and sex. The percent examined was quite
similar for both boys and girls and for each age
group. The response rate was also stable for
each PSU, ranging only from 90.6 to 100.0 percent
as shown in table L.
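
A minimal sketch of this kind of nonresponse adjustment is given below. It assumes, as the estimator presented later suggests, that the adjustment factor for a PSU-age-sex cell is the number of sample children in the cell divided by the number examined. The record layout, field names, and weights are hypothetical.

```python
from collections import defaultdict


def nonresponse_adjust(children):
    """Inflate the weights of examined children within each PSU-age-sex cell.

    children: list of dicts with keys "psu", "age", "sex", "weight", "examined".
    Returns new records for examined children only, with adjusted weights.
    """
    n_sample = defaultdict(int)
    n_examined = defaultdict(int)
    for child in children:
        cell = (child["psu"], child["age"], child["sex"])
        n_sample[cell] += 1
        if child["examined"]:
            n_examined[cell] += 1

    adjusted = []
    for child in children:
        if child["examined"]:
            cell = (child["psu"], child["age"], child["sex"])
            factor = n_sample[cell] / n_examined[cell]
            adjusted.append({**child, "weight": child["weight"] * factor})
    return adjusted


# Hypothetical cell: four 8-year-old boys in one PSU, one of them not examined.
cell = [
    {"psu": 1, "age": 8, "sex": "M", "weight": 1000.0, "examined": True},
    {"psu": 1, "age": 8, "sex": "M", "weight": 1000.0, "examined": True},
    {"psu": 1, "age": 8, "sex": "M", "weight": 1000.0, "examined": True},
    {"psu": 1, "age": 8, "sex": "M", "weight": 1000.0, "examined": False},
]
result = nonresponse_adjust(cell)
print(len(result), sum(c["weight"] for c in result))
```

The three examined children each take a factor of 4/3, so their adjusted weights again represent all four sample children in the cell.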

The poststratified ratio adjustment used in 
the second cycle achieved most of the gains in 
precision which would have been attained if the 
sample had been drawn from a population strat- 
ified by age, color, and sex. The effect is to 



Table K. Number of eligible children in the sample, number selected for examination,
and percent examined, by age and sex

                  Number of                Number of             Percent of sample
              eligible children         sample children          children examined
Age          Total    Boys   Girls     Total    Boys   Girls     Total    Boys   Girls

All ages     8,589   4,368   4,221     7,417   3,765   3,652      96.0    96.5    95.5

6 years      1,350     690     660     1,161     596     565      95.7    96.5    94.2
7 years      1,500     768     732     1,293     655     638      96.0    96.5    95.5
8 years      1,492     754     738     1,281     649     632      96.1    95.2    97.0
9 years      1,430     715     715     1,231     618     613      96.2    97.6    94.8
10 years     1,392     693     699     1,208     594     614      96.0    97.0    95.1
11 years     1,425     748     677     1,243     653     590      95.9    96.2    95.6

Table L. Number of sample children and number and percent examined, by stand number
and location: Health Examination Survey, 1963-65

                                           Stand      Number of        Examined
Stand location (1)                         number     sample        Number   Percent
                                                      children

All stands                                               7,417       7,119      96.0

Portland, Maine                                1           200         198      99.0
Ashtabula, Ohio                                2           185         175      94.6
Poughkeepsie, New York                         3           193         190      98.4
Ottumwa, Iowa                                  4           196         195      99.5
Boston, Massachusetts                          5           192         174      90.6
Denver, Colorado                               6           192         189      98.4
Philadelphia, Pennsylvania                     7           192         174      90.6
Lamar, Colorado                                8           183         183     100.0
Charleston, South Carolina                     9           186         171      91.9
Los Angeles, California                10 and 12           285         266      93.0
Sarasota, Florida                             11           188         185      98.4
Atlanta, Georgia                              13           191         187      97.9
San Francisco, California                     14           189         187      98.9
Baltimore, Maryland                           15           193         186      96.4
Mariposa, California                          16           188         186      98.9
New York, New York                     17 and 19           421         390      92.6
Moses Lake, Washington                        18           193         189      97.9
Minneapolis, Minnesota                        20           201         194      96.5
Grand Rapids, Michigan                        21           191         186      97.4
Neillsville, Wisconsin                        22           201         201     100.0
Chicago, Illinois                             23           301         283      94.0
Des Moines, Iowa                              24           160         159      99.4
Barbourville, Kentucky                        25           196         185      94.4
Wichita, Kansas                               26           188         178      94.7
Marked Tree, Arkansas                         27           186         182      97.8
Brownsville, Texas                            28           179         175      97.8
Houston, Texas                                29           186         181      97.3
Birmingham, Alabama                           30           149         144      96.6
Detroit, Michigan                             31           168         162      96.4
Lapeer and Marysville, Michigan               32           179         175      97.8
Cleveland, Ohio                               33           175         166      94.9
West Liberty and Beattyville, Kentucky        34           172         160      93.0
Allentown, Pennsylvania                       35           173         159      91.9
Manchester and Bristol, Connecticut           36           174         167      96.0
Newark, New Jersey                            37           177         167      94.4
Jersey City, New Jersey                       38           175         163      93.1
Georgetown, Delaware                          39           163         159      95.5
Columbia, South Carolina                      40           156         148      94.9

(1) Cities in which trailers were located. Sample areas consisted of the PSU's, which
may have included several counties.

NOTE: Sample "take" for Los Angeles was deliberately somewhat low for "two stand
locations" because that area should be only slightly over 1-1/2 stands on a population
basis. Chicago, on the other hand, was oversampled in comparison with other "one stand
locations," since it should be represented by slightly under 1-1/2 stands.

make the final sample estimates of population
agree exactly with independent controls prepared
by the Bureau of the Census for the U.S.
noninstitutional population at the midsurvey period
(August 1, 1964) by color and sex for each single
year of ages 6 through 11. The weights of every
responding sample child in each of the 24 age,
color, and sex classes were adjusted upward or
downward so that the weighted total within the
class equaled the independent population control.
The poststratified adjustments required are shown
in table M.



Table M. Poststratified adjustment factors (ratio of Census population
control totals to Cycle II weighted estimates)

                   White              All other
Age            Boys    Girls       Boys    Girls

6 years        1.06     1.08       1.14     1.29
7 years        0.92     0.99       1.20     1.01
8 years        0.95     1.00       1.21     0.82
9 years        1.01     0.98       1.20     1.01
10 years       1.00     0.93       1.34     1.14
11 years       0.91     1.01       1.01     0.96
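
The poststratified ratio adjustment amounts to one ratio per age-color-sex class, applied to every examined child in that class. The sketch below is illustrative only: the weights and the control total are hypothetical and were chosen so that the factor comes out at 0.95; they are not survey data.

```python
def poststratify(children, controls):
    """Scale weights so each class total equals its independent control.

    children: list of dicts with keys "cls" (age, color, sex) and "weight".
    controls: dict mapping each class to its population control total.
    Returns the adjusted records and the factor applied to each class.
    """
    totals = {}
    for child in children:
        totals[child["cls"]] = totals.get(child["cls"], 0.0) + child["weight"]
    factors = {cls: controls[cls] / total for cls, total in totals.items()}
    adjusted = [{**c, "weight": c["weight"] * factors[c["cls"]]} for c in children]
    return adjusted, factors


cls = (8, "white", "boys")
sample = [{"cls": cls, "weight": w} for w in (3000.0, 3500.0, 3500.0)]
adjusted, factors = poststratify(sample, {cls: 9500.0})
print(factors[cls], sum(c["weight"] for c in adjusted))
```

A factor below 1 means the weighted sample overstated the class relative to the Census control; a factor above 1 means it understated the class.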



To aid in understanding the estimation procedure,
the estimator is presented as follows:

Consider an x-characteristic of the $l$th person in the
$k$th segment, $j$th age-sex class, $i$th PSU, $h$th
stratum, and the $g$th age-sex-color class in the United
States, denoted by $X_{ghijkl}$. An estimator, $x'$, of a
total aggregate, $X$, in the U.S. population is derived
from Cycle II data using the following equation:

$$x' = \sum_{g=1}^{24} R_g \sum_{h=1}^{40} W_{1.h} \sum_{i} W_{2.hi} \sum_{j} \frac{N_{.hij}}{N'_{.hij}} \sum_{k} W_{3.hik} \sum_{l} X_{ghijkl}$$

where

$R_g = Y_g / Y'_g$ = ratio of total U.S. noninstitutional
population in the $g$th age-sex-color group according to
the 1964 census figures to the estimated total U.S.
noninstitutional population in the $g$th age-sex-color
group using a simple inflation-type estimator, adjusted
for nonresponse.

$W_{1.h}$ = first-stage design weight for the $h$th stratum
(i.e., superstratum) = $P_h^{-1}$, the reciprocal of the
probability of selecting a PSU from the $h$th stratum.

$W_{2.hi}$ = second-stage design weight for the $i$th PSU
in the $h$th stratum = the reciprocal of the probability
of selecting a segment from the $i$th PSU in the $h$th
stratum.

$W_{3.hik}$ = third-stage design weight = the reciprocal
of the probability of selecting a person in the subsample
from eligible persons in the $k$th segment, $i$th PSU,
$h$th stratum.

$N_{.hij}$ = total eligible persons, after subsampling,
in the $j$th age-sex class in the $i$th PSU and $h$th
stratum.

$N'_{.hij}$ = total examined eligible persons in the
$j$th age-sex class in the $i$th PSU and $h$th stratum.

In addition to the adjustment factors indicated
in the equation, another adjustment of H was
applied to data collected in the first eight stands
completed since 22 "regular" segments per PSU
were originally selected and only 20 were used.
The distribution of final estimation weights is
shown in table N.
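
For a single examined child the final weight is the product of the three design weights, the nonresponse factor, and the poststratified ratio. The sketch below only illustrates that multiplication; the function and every probability and count in it are hypothetical values chosen for the arithmetic, not figures from the survey.

```python
def final_weight(p_psu, p_segment, p_child, n_sample, n_examined, post_ratio):
    """Combine the weighting factors for one examined child.

    p_psu, p_segment, p_child : selection probabilities at the three stages
    n_sample, n_examined      : counts in the child's PSU-age-sex class
    post_ratio                : poststratified ratio for the child's
                                age-sex-color class
    """
    w1 = 1.0 / p_psu      # first-stage design weight
    w2 = 1.0 / p_segment  # second-stage design weight
    w3 = 1.0 / p_child    # third-stage (subsampling) design weight
    return post_ratio * w1 * w2 * (n_sample / n_examined) * w3


# Hypothetical child: nine in ten eligible children retained in the PSU,
# and 19 of 20 sample children examined in the child's age-sex class.
weight = final_weight(p_psu=0.04, p_segment=0.01, p_child=0.9,
                      n_sample=20, n_examined=19, post_ratio=1.0)
print(round(weight))
```

Under these assumed inputs the weight falls in the 2,000-2,999 class, one of the two classes that table N reports as most common.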

Table N. Distribution of final estimation weights for examined children

Weight class         Distribution of examined children (percent)

1,000-1,999                     2.9
2,000-2,999                    38.8
3,000-3,999                    42.7
4,000-4,999                    11.8
5,000-5,999                     1.1
6,000-6,999                     0.9
7,000-8,999                     0.8
9,000-10,999                    0.6
11,000-14,279                   0.4
VARIANCE ESTIMATION

Background

Standard errors of estimates of parameters
for the sample were estimated by means of the
(balanced) half-sample replication technique, first
adapted to use for large-scale surveys by Simmons
and Losee, described in references 23-25 and 29-31.
The reasons for the adoption of this method were
both operational and theoretical. The following
major characteristics of the survey suggested
requirements that were largely or wholly met by
the half-sample replication technique.

1. Since the obtaining of data for any single
sample child is costly, the sample size is
necessarily limited. The obvious statistical objective
of maximum exploitation of the data is particularly
meaningful in the context of the Health
Examination Survey since an increase in sample
size has an immediate and consequential
impact on costs. The Health Examination Survey
cannot afford, for example, overuse of commonly
employed "upper limit" approximations to
sampling errors as might be done with a large
sample group.

2. Because the sampling errors of most
statistics are large enough to be meaningful
in analysis and many are large enough to be
critical to the analytical conclusions, a high
degree of computational support for the
researchers analyzing the material is indicated.
Standard errors must be made available quickly
so that a particular investigation, which
frequently advances in stepwise fashion with the
next analytical step depending on the results of
the last, may proceed with reasonable speed.

3. The complete algebraic formula for estimation
of sampling errors for the survey design
is unknown. This is because of the nature
and complexity of the design as described in the
preceding sections. While the algebraic relationships
are identifiable or capable of being developed
for particular subprocedures (such as the
use of cluster and multistage sampling within
strata to reduce costs, the poststratification
techniques used to reduce sampling variance,
or the nonresponse adjustments to reduce
bias), a single, composite estimating equation
for the standard error of survey statistics
cannot be developed. The use of the Goodman-Kish
controlled selection technique as part of
the selection process in itself precludes this,
since, while it is known that such controlled
selection should reduce the sampling variance,
theory does not exist to permit algebraic
quantification of the extent of the reduction using only
sample information. Even if controlled selection
were eliminated as a definitive factor, the
extreme complexity of the combination of the various
other elements of the design would probably
preclude, as a matter of practicality, direct
algebraic estimation.

4. In a large, multidimensional investigation,
such as the National Health Examination Survey
of children, interest frequently centers on
studies of characteristics of various population
subgroups. The numbers of persons in these
subgroups, or domains of study, are in themselves
random variables. Algebraic techniques
for computation of standard errors of statistics
relating to them have been developed by Cochran
and others for certain restricted designs, all
considerably less involved than the survey design
used for Cycle II. Their use, however,
introduces some bias, considerable complexity,
and formidable computational effort.

Summary of Applicable Theory 

The population is classified into L strata, 
from each of which two sample PSU's are drawn, 
with equal probability within the stratum, but 
not necessarily across strata. The desideratum 
of selection of exactly two sample PSU's reflects 
an essential element of the theory and may be 
met by post facto "collapsing" of two strata 
from each of which only a single PSU has been 
drawn or by creating an artificial PSU by ran- 
dom methods from the operational PSU selected 
from such strata. 

Of analytical interest is parameter P for
which an estimate, p, is to be obtained from the
sample. The estimator, p, is a linear combination
of the sample observations in fully rigorous
developments, although, as will be seen
later, this requirement may be compromised in
applications with little practical effect.
A half-sample replicate is defined as the
collection of L PSU's obtained by selecting
one of the paired sample PSU's from each
stratum. (These may be referred to simply
as replicates or half samples for brevity.)
Designating $i = 1, 2$ as the subscript to identify
the sample PSU's within each stratum,
$h = 1, 2, 3, \ldots, L$ to identify the strata, and
$a = 1, 2, 3, \ldots, A$, where $A > L$, as half-sample
replicate identification, the pattern may be
summarized as in table O, where a "+" indicates
that a PSU falls into the particular half sample,
and the absence of a "+" indicates that it does not.

Analogues of the linear estimator p corresponding
to each half sample are then computed.
That is, for the $a$'th half sample, $p_a$ is
calculated by summing across strata as:

$$p_a = \sum_{h=1}^{L} W_h \, p_{hi}$$

where $W_h$ is the proportion of persons in the
$h$'th stratum ($\sum_{h=1}^{L} W_h = 1$), $i$ = either 1 or 2
depending on which PSU of the stratum is in
half-sample $a$, and $p_{hi}$ is, in this example,
a mean.

Table O. Half-sample replication formation

Rows: half-sample replications 1, 2, 3, ..., A.
Columns: PSU 1 and PSU 2 within each of strata 1, 2, 3, ..., L.
Entries: a "+" marks each PSU included in the half sample.

The estimator p, calculated using all the
information in the sample, is:

$$p = \sum_{h=1}^{L} W_h \, \frac{p_{h1} + p_{h2}}{2}$$

The variance of the estimator p is calculated as:

$$S_p^2 = \frac{1}{A} \sum_{a=1}^{A} (p_a - p)^2$$


A set of side conditions relating to the selection
of PSU's for development of the half
samples has been developed by McCarthy,
based on work by Plackett and Burman and by Gurney.
The significance of this procedure is that greatly
increased stability in the estimate is obtained
by eliminating a between-strata contribution
of variance otherwise present in calculating
$S_p^2$ across half samples. The $S_p^2$ calculated
for a set of half samples formed according to
the McCarthy criteria is numerically equal to
the value which would be obtained if all $2^L$
possible half samples had been formed. A set of
half samples selected according to the McCarthy
criteria is called a balanced set, and the procedure
is referred to as balanced half-sample
pseudoreplication. The Cycle II variance estimations
are calculated by using balanced half-sample
replication methods, and reference to
the technique throughout this report implies a
balanced pattern.
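
The equality between a balanced set and the full set of half samples can be checked numerically for a linear estimator. The sketch below assumes the usual forms: each half-sample estimate is the stratum-weighted sum of the selected PSU means, the full-sample estimate averages the two PSU means in each stratum, and the variance is the mean squared deviation of the half-sample estimates from the full-sample estimate. A Sylvester-type Hadamard matrix is used only as one convenient source of an orthogonal pattern; the report does not say which pattern was actually used, and the stratum weights and PSU means are hypothetical.

```python
from itertools import product


def hadamard(order):
    """Sylvester construction of a Hadamard matrix; order must be a power of 2."""
    matrix = [[1]]
    while len(matrix) < order:
        matrix = ([row + row for row in matrix]
                  + [row + [-x for x in row] for row in matrix])
    return matrix


def balanced_patterns(n_strata):
    """One row of +1/-1 signs per half sample: +1 picks PSU 1, -1 picks PSU 2."""
    order = 1
    while order <= n_strata:  # more half samples than strata
        order *= 2
    # Drop the all-ones first column and keep one column per stratum.
    return [row[1:n_strata + 1] for row in hadamard(order)]


def replication_variance(weights, psu_means, patterns):
    """Mean squared deviation of half-sample estimates from the full estimate."""
    p = sum(w * (m1 + m2) / 2 for w, (m1, m2) in zip(weights, psu_means))
    total = 0.0
    for signs in patterns:
        p_a = sum(w * (m[0] if s > 0 else m[1])
                  for w, m, s in zip(weights, psu_means, signs))
        total += (p_a - p) ** 2
    return total / len(patterns)


weights = [0.5, 0.3, 0.2]                              # hypothetical W_h
psu_means = [(10.0, 12.0), (20.0, 17.0), (5.0, 9.0)]   # hypothetical PSU means

balanced = replication_variance(weights, psu_means, balanced_patterns(3))
exhaustive = replication_variance(weights, psu_means,
                                  list(product([1, -1], repeat=3)))
print(balanced, exhaustive)
```

With three strata the balanced set has 4 half samples against 8 in the exhaustive set, and both give the same variance because the cross-stratum terms cancel when the sign columns are orthogonal. With 20 pseudostrata this construction would need 32 half samples instead of 2^20.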

Estimates of standard errors developed according
to this technique have several highly
desirable attributes, both in calculation and in
concept. The more important are summarized
by McCarthy as:

"Replicated sampling permits one to bypass
the extremely complicated variance estimation
formulas and the attendant heavy
programming burdens. Variance estimates
based upon the replicated estimates will
mirror the effects of all aspects of sampling
and estimation that are permitted to
vary randomly from replicate to replicate.
This of course includes the troublesome
domain-of-study problem."

The theory is completely rigorous only in
the case in which the statistics for which standard
errors are being estimated are linear functions
of the sample observations. Several empirical
investigations indicate that use of certain
ratio estimators and correlation statistics results
in a bias that is unimportant, if detectable
at all, in an analytical context. Such
bias is not considered to be of practical importance
in application of the replication method to
Cycle II data, as described below.

Application to Cycle II Data

The starting point for Cycle II replication
procedures is the set of 40 PSU's, one from
each of the 40 HES superstrata as previously
defined. Associated with each of these PSU's is
a sampling fraction which is numerically equal
to the probability of selection of the PSU
although, as described in a preceding section, the
actual mechanics of selection of the PSU involve
application of the Goodman-Kish side conditions,
which are more complicated (and contribute more
to reduction in sampling variation) than simple
selection of the PSU with probability proportional
to size. An example will clarify the way in which
the weights associated with the sample PSU's
were computed.

In the Northeastern Region, superstratum
Ciii is composed of 11 HIS strata or PSU's with
a combined 1960 census population of 3,759,516
(table P). This HES stratum consisted of SMSA's
of under 1,000,000 population in 1960 (C designation),
which contained the smaller SMSA's (iii
designation) in this category.

HES superstratum Ciii includes HIS stratum
No. 211, which is in turn composed of two HIS
PSU's: Portland, Maine, SMSA (1960 census population
120,655) and Atlantic City, N.J., SMSA
(1960 census population 160,880).

Under the Goodman-Kish selection technique,
HIS stratum No. 211 is selected from the 11
HIS strata which constitute HES superstratum
Ciii. The Portland, Maine, HIS sample PSU, which
has already been drawn from HIS stratum No.
211 for HIS purposes, is then selected for HES
purposes with probability 1 and is designated as
the HES "stand." The numerical value of the
probability of selection of the Portland, Maine,
stand in this case is:

$$\frac{120{,}655}{120{,}655 + 160{,}880} \times \frac{120{,}655 + 160{,}880}{3{,}759{,}516}$$

although, as explained in a previous section, the
actual (Goodman-Kish) selection procedure resulting
in this probability is operationally different
from simple probability proportional to
size selection, which might be (incorrectly) inferred
from the above two fractions. The actual
selection procedure is also conceptually different
since the Goodman-Kish side conditions result
in a smaller sampling variance.
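
The arithmetic of the Portland example can be verified directly. The sketch below simply multiplies the two fractions, using the three census populations quoted in the text; the variable names are illustrative, and the reciprocal is shown as the implied first-stage weight.

```python
from fractions import Fraction

portland = 120_655         # Portland, Maine, SMSA, 1960 census
atlantic_city = 160_880    # Atlantic City, N.J., SMSA, 1960 census
superstratum = 3_759_516   # HES superstratum Ciii, 1960 census

# Portland within HIS stratum No. 211, then stratum 211 within the superstratum.
p_within_stratum = Fraction(portland, portland + atlantic_city)
p_stratum = Fraction(portland + atlantic_city, superstratum)

p_stand = p_within_stratum * p_stratum
print(float(p_stand))       # probability of selecting the Portland stand
print(float(1 / p_stand))   # corresponding first-stage weight
```

The stratum population cancels, so the probability reduces to 120,655 / 3,759,516, about 0.032, and the first-stage weight is about 31.16.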

The stands, or examination locations, corresponding
to the PSU's thus selected are identified
in table P together with the HES superstrata
with which they are associated.

As stated previously, the balanced half-sam- 
ple replication theory is based upon selection of 
one sampling unit from a stratum containing ex- 
actly two such units. It was therefore necessary 
at this point to create HES artificial or "pseudo" 
strata from pairs of HES strata in order to make 
use of the half-sample replication model. Two 
procedures were used, depending on whether or 
not the defined HES strata were self-representing. 

For both self-representing (certainty) and 
non-self-representing (noncertainty) HES strata, 
strata were paired on the basis of (1) some 
subjective determination of the homogeneity of 
the population in which the primary consider- 
ations were population density, region, rate of 
growth, and industry and (2) concern that strata 
of approximately equal size would be paired. 
The latter has no theoretical or practical effect 
on variance computations in Cycle II since the 
factors necessary to adjust for unequal size of 
members of the pair were introduced into the 
weighting procedures specific for each replication 
(reference 22, page 285). The former is of con- 
cern, in that members of the pair may have 
markedly different characteristics with respect 
Table P. Definition of HES pseudostrata for replication purposes

HES          1960 Census of     Region   HES            Stand       Stand location
super-       Population of               pseudostratum  number
stratum      HES superstratum            number

Non-self-representing HES strata

Bii          (illegible)        NE       01             5           Boston, Mass.
Bi           4,183,250          NE       01             37          Newark, N.J.
Ci           3,759,760          NE       02             38          Jersey City, N.J.
Cii          3,768,466          NE       02             35          Allentown, Pa.
Di           4,271,826          NE       03             3           Columbia-Dutchess, N.Y. (Poughkeepsie, N.Y.)
Dii          4,843,253          NE       03             36          Hartford-Tolland, Conn. (Manchester and Bristol, Conn.)
Bii          3,776,544          S        04             40          Columbia, S.C.
Biii         3,961,447          S        04             9           Charleston, S.C.
Cii          4,961,779          S        05             27          Crittenden-Poinsett (Marked Tree, Ark.)
Di           4,622,338          S        05             39          Sussex (Georgetown, Del.)
Ciii         4,973,857          S        06             25          Bell-Knox-Whitley, Ky. (Barbourville)
Dii          4,415,267          S        06             34          Breathitt-Lee, Ky. (West Liberty and Beattyville)
Bi           3,856,698          MW       07             33          Cleveland, Ohio
Bii          5,155,715          MW       07             20          Minneapolis-St. Paul, Minn.
Di           4,507,428          MW       08             32          Lapeer-St. Clair, Mich. (Lapeer and Marysville)
Dii          4,156,090          MW       08             2           Ashtabula-Geauga, Ohio
Aiii         3,890,572          W        09             14          San Francisco, Calif.
Bii          4,899,898          W        09             6           Denver, Colo.
Dii          5,519,588          W        10             8           Prowers, Colo. (Lamar)
Diii         5,115,227          W        10             16          Mariposa, Calif.
Aii          4,318,307          S        11             13          Atlanta, Ga.
Bi           3,587,125          W        11             29          Houston, Tex.
Ci           4,895,507          MW       12             24          Des Moines, Iowa
Ci           5,047,027          W        12             26          Wichita, Kans.
Bi           3,472,118          S        13             30          Birmingham, Ala.
Biii         4,799,314          MW       13             21          Grand Rapids, Mich.
Diii         4,384,792          MW       14             22          Clark, Wis. (Neillsville)
Di           5,207,020          W        14             18          Grant, Wash. (Moses Lake)
Ciii         3,759,516          NE       15             1           Portland, Maine
Cii          4,570,419          MW       15             4           Mahaska-Wapello, Iowa (Ottumwa)
Ci           4,739,463          S        16             11          De Soto-Sarasota, Fla. (Sarasota)
Cii          4,841,990          W        16             28          Brownsville, Tex. (Brownsville)

Self-representing HES strata

Aiii         (illegible)        NE       01A, 01B       7           Philadelphia, Pa.
Ai           3,728,920          S        01A, 01B       15          Baltimore, Md.
Ai           6,794,461          MW       02A, 02B       23          Chicago, Ill.
Aii          3,762,360          MW       02A, 02B       31          Detroit, Mich.
Ai, Aii      6,742,696          W        03A, 03B       10 and 12   Los Angeles, Calif.
Ai, Aii      10,694,633         NE       04A, 04B       17 and 19   New York, N.Y.


to a particular variable under study. To the extent
that this is true, the expected value of the
estimated standard error may be positively biased.
That is, as the subjective pooling or "collapsing"
of strata becomes a compromising procedure, a more
conservative estimate (i.e., an overstatement) of
the sampling variance is obtained (reference 22,
page 283). Evaluation of this effect for Cycle II
data suggests that any resulting overstatement of
sampling variance is of trivial consequence in an
analytical context.

The specific pairing or "collapsing" procedures
used for Cycle II are indicated in table P.

For self-representing strata, an additional
procedure was followed to ensure homogeneity
of populations. This is best described in terms
of an example using the first two self-representing
superstrata identified in table P. After the
pairing of HES superstrata Aiii (NE) and Ai (S),
sample segments in the Philadelphia and Baltimore
PSU's were selected in random serpentine fashion
so that HES Pseudo-PSU 01A, the population
corresponding to half of the segments, includes a
randomly defined part of both the Philadelphia
SMSA and Baltimore SMSA populations. This is, of
course, also true for HES Pseudo-PSU 01B. These
two Pseudo-PSU's constitute HES Pseudostratum 01.

As indicated in table P, Los Angeles and
New York are special cases in which a single
HES pseudostratum was defined from a single
HES superstratum, the usual procedure being the
definition of a single HES pseudostratum from
two HES superstrata. They were, however,
subjected to the randomization process described
in the preceding paragraph, even though the
artificially defined "stands" for these areas
had already been defined on the basis of randomly
selected segments with no geographical clustering.

For non-self-representing strata, the pseu- 
dostrata were defined on the basis of size and 
homogeneity of population as shown in table P. 

Having defined the 20 (artificial) pseudostrata,
each consisting of two PSU's, the balanced
half-sample replication pattern following the
Plackett-Burman techniques may be applied.
This was done, and 20 half-sample replications
were formed according to the constraints
developed by Plackett and Burman. Each
(half-sample) replication consisted of 20 sample
PSU's, one being selected from each pseudostratum.
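
The balanced selection described above can be illustrated with a short Python sketch. It builds a 20 x 20 matrix of +1 and -1 entries with mutually orthogonal rows and columns by the Paley construction for modulus 19, which is one known way of obtaining a pattern with the orthogonality property of the Plackett-Burman designs. The matrix actually used for Cycle II is not reproduced in this report, so the function names and the rule that +1 selects pseudo-PSU "A" and -1 selects pseudo-PSU "B" are assumptions made only for illustration.

```python
def hadamard_20():
    """Return a 20 x 20 matrix of +1/-1 entries with orthogonal rows (Paley construction, q = 19)."""
    q = 19
    residues = {(x * x) % q for x in range(1, q)}  # nonzero quadratic residues mod 19

    def chi(a):
        # Quadratic character mod q: 0 for 0, +1 for residues, -1 otherwise.
        a %= q
        if a == 0:
            return 0
        return 1 if a in residues else -1

    n = q + 1
    h = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                h[i][j] = 1          # identity part of the construction
            elif i == 0:
                h[i][j] = 1          # border row
            elif j == 0:
                h[i][j] = -1         # border column
            else:
                h[i][j] = chi(j - i)  # skew-symmetric core
    return h


def half_samples(h):
    """Translate each row into pseudo-PSU labels (hypothetical rule: +1 -> 'A', -1 -> 'B')."""
    return [
        [f"{stratum + 1:02d}{'A' if sign == 1 else 'B'}" for stratum, sign in enumerate(row)]
        for row in h
    ]


H = hadamard_20()
REPLICATES = half_samples(H)  # 20 replicates, each naming one pseudo-PSU per pseudostratum
```

Each row of `REPLICATES` names one pseudo-PSU from each of the 20 pseudostrata, and the orthogonality of the columns is what makes the set of half-samples "balanced."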
One additional refinement was undertaken
before variance computations were made. This
was the development and application of factors
to adjust each individual replication to the (Census)
independent control populations for 24 age-color-sex
classes. For example, the combined sampling
and nonresponse weights for 8-year-old white
male children in replication four were adjusted
so that the national estimate of all such children,
using only the sample information contained in
replication four, results in a figure of 1,739,000,
the independent Census estimate of this population
as of August 1, 1964. In summary, each replication
(which contains about half of the sample cases)
yields an estimate of the population in each class
which is numerically equal to the estimate obtained
from the whole sample, because of the application of
these adjustment factors. While this removes a
small amount of bias from the estimated sampling
variance, the process involves considerable work,
and insufficient evidence is available on which
to base a decision as to whether or not it is
worth the cost. Pending further methodological
investigations, a prudent approach was adopted
for Cycle II data and the factors were applied
as described.
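
The adjustment of one replicate to a control total amounts to a simple ratio factor, as the following Python sketch suggests. The control figure of 1,739,000 is the one quoted above; the four case weights and the function name are hypothetical and stand in for the combined sampling and nonresponse weights of the cases in one age-color-sex class of one replication.

```python
def adjust_to_control(weights, control_total):
    """Scale the weights of one class in one replicate so that they sum to the control total."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("the class has no weighted cases in this replicate")
    factor = control_total / total
    return factor, [w * factor for w in weights]


# Census control for 8-year-old white males, as quoted in the text.
CONTROL_TOTAL = 1_739_000

# Hypothetical weights for the cases of this class falling in one replicate.
example_weights = [410_000.0, 455_500.0, 398_250.0, 430_000.0]

FACTOR, ADJUSTED = adjust_to_control(example_weights, CONTROL_TOTAL)
```

After the adjustment the replicate estimate of the class population equals the control figure, whatever the unadjusted weights were.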

The only remaining step is the application of 
the theory stated earlier to produce the variance 
estimates. To avoid restatement of the theory, 
application will be noted in the form of an ex- 
ample, paralleling the theory presented earlier. 

Data from Cycle II show that the mean number
of upper arch permanent teeth among 8-year-old
boys in families for which the annual family
income is reported as between $5,000 and $6,999
is 5.17, i.e., p = 5.17 using the previous notation.
For each of the 20 half-sample replicates, the
analogue $p_{\alpha}$ is computed (table Q). The
sampling variance of p is then estimated as

$$\frac{1}{20}\sum_{\alpha=1}^{20}\left(p_{\alpha}-p\right)^{2} = .008545$$

For analytical convenience several functions
of the estimated sampling variance are then
calculated and routinely displayed. The values of
these for this estimate of the mean number of
upper arch permanent teeth are as follows:

Mean upper arch permanent teeth ....................... 5.17
Standard error of mean ................................ .09
Estimated population (denominator) .................... 437,000
Standard error of denominator ......................... 34,000
Estimated upper arch permanent teeth (numerator) ...... 2,258,000
Standard error of numerator ........................... 178,000
Rel-variance of mean .................................. .00032
Rel-variance of denominator ........................... .00666
Rel-variance of numerator ............................. .00625
Sample frequency ...................................... 140



A standard computer program is available
whereby means, standard errors of means, sample
sizes, and the associated indexes of sampling
variability are obtained for a cross-classification
of about 300 cells with simple and routine
specifications. Row percentages and rates with
associated statistics are also options. Replicate
variance calculations are also programed for
correlation and regression statistics, although
at this writing data processing restrictions limit
use of this latter program to methodological
investigations rather than routine analytical
purposes.

Table Q. Half-sample replicate estimates
of mean number of upper arch permanent
teeth for 8-year-old boys with family
income of $5,000-$6,999

Replicate                  Replicate
number      $p_{\alpha}$   number      $p_{\alpha}$

 1          5.1029         11          5.1899
 2          5.0685         12          5.0066
 3          5.1964         13          5.2291
 4          5.2701         14          5.2074
 5          5.1602         15          5.0424
 6          5.2353         16          5.0260
 7          5.1779         17          5.2465
 8          5.2547         18          5.3713
 9          5.1619         19          5.1005
10          5.1116         20          5.0737
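
As a check on the arithmetic, the replicate estimates of table Q can be combined in a few lines of Python. The sketch below is illustrative only: the variable and function names are assumptions, and the full-sample mean is entered as the rounded published value 5.17, so the computed variance agrees with the published .008545 only to about two significant figures.

```python
import math

# Half-sample replicate estimates from table Q, replicates 1 through 20.
REPLICATE_MEANS = [
    5.1029, 5.0685, 5.1964, 5.2701, 5.1602,
    5.2353, 5.1779, 5.2547, 5.1619, 5.1116,
    5.1899, 5.0066, 5.2291, 5.2074, 5.0424,
    5.0260, 5.2465, 5.3713, 5.1005, 5.0737,
]

P_FULL = 5.17  # full-sample mean, as rounded in the text


def half_sample_variance(replicates, full_estimate):
    """Mean squared deviation of the replicate estimates about the full-sample estimate."""
    return sum((p_a - full_estimate) ** 2 for p_a in replicates) / len(replicates)


VARIANCE = half_sample_variance(REPLICATE_MEANS, P_FULL)
STANDARD_ERROR = math.sqrt(VARIANCE)   # compare with the published .09
REL_VARIANCE = VARIANCE / P_FULL ** 2  # compare with the published .00032
```

The standard error and the rel-variance of the mean are two of the "functions of the estimated sampling variance" listed in the text; the remaining entries would require the replicate numerators and denominators, which are not shown in table Q.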
APPENDIX I
GLOSSARY OF TERMS

Primary sampling unit (PSU).--A geographic
entity composed of one or more contiguous counties,
or a standard metropolitan statistical area (SMSA). The
3,103 counties and independent cities in the United States
were grouped to form 1,891 PSU's. Details of how PSU's
were formed are presented in the text of this report.

Self-representing PSU's.--Those PSU's which
cover an entire stratum. The 1,891 PSU's were grouped
into 357 HIS strata. Of these, 112 are composed of
a single PSU and 245 contain more than one. Since
one PSU was selected in the sample from each stratum,
those strata containing only one PSU are self-representing
and those containing more than one PSU are
non-self-representing.

First-stage units (FSU's).--With a few exceptions,
an FSU is synonymous with an HIS stratum, consisting
of the aggregate of PSU's, sample and nonsample,
in the stratum.

HES superstratum.--Consists of one or more
FSU's. For the Cycle II sample, 364 FSU's were
grouped into 40 superstrata. Eight superstrata were
self-representing, and 32 were non-self-representing.

Pseudostratum.--An artificial stratum formed by
combining two superstrata, or by combining "random
halfs" of superstrata. Examples of the latter are
two pseudostrata, each comprising about half of the
population of the Philadelphia SMSA plus half of the
population of the Baltimore SMSA. The pseudostrata
were conceptual entities used in the estimation of
variances by the half-sample replication method.
Twenty pseudostrata were defined, 16 from the
combining of two superstrata.

Standard metropolitan statistical area (SMSA).--A
county or group of contiguous counties (except in New
England) which contains at least one central city of
50,000 people or more, or "twin cities" with a combined
population of at least 50,000. In addition, other
contiguous counties are included in an SMSA if,
according to certain criteria, they are socially and
economically integrated with the central city. A
detailed explanation and a listing of the component
areas of each SMSA is given in Bureau of the
Budget, Standard Metropolitan Statistical Areas, 1967
Edition.
APPENDIX II

PROCEDURE FOR FORMING AND STRATIFYING PSU's IN THE CURRENT
POPULATION SURVEY AND THE HEALTH INTERVIEW SURVEY DESIGNS

Formation of PSU's

Several rules were followed in defining and forming
PSU's. They were:

1. Each PSU should comprise one or more contiguous
   counties. PSU's involving metropolitan
   counties were defined as consisting of whole
   SMSA's, except in New England where towns
   and cities rather than counties were used in
   defining SMSA's. (For definition of SMSA, see
   appendix I.)

2. PSU's should not cross regional lines, i.e., the
   four standard Census regions: Northeast, North
   Central, South, and West. However, it was not
   possible to follow this rule entirely as eight
   SMSA's crossed regional boundaries.

3. The area of a PSU should not exceed 2,000
   square miles in the West Region and 1,500
   square miles in other regions, except in cases
   where a single county exceeds the maximum
   area.

4. The 1960 population of each PSU should be at
   least 7,500 in the West Region and 10,000 in
   other regions, except in cases where this
   would require exceeding the specified maximum
   area.

5. PSU's should be formed in such a way as to
   avoid extreme length in any direction.

6. For situations in which more than one county
   was to be grouped to form a PSU, the principle
   was to make the groups as heterogeneous
   as possible with respect to a number of variables.
   The principal ones were economic area,
   principal industry (used primarily in urban
   areas), value of agricultural products (used
   primarily in rural areas), and the proportion
   of the county's population that was not white.
   The last item was used only in areas where
   there was appreciable variation between
   counties, primarily in the South.

A more detailed description of the formation of
the PSU's may be found in Bureau of the Census
Technical Paper No. 7.
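
The area and population limits in rules 3 and 4 above lend themselves to a mechanical check. The Python sketch below is only an illustration of those two rules: the function name, the argument names, and the wording of the returned messages are assumptions, and the exception in rule 4 (a population shortfall is tolerated where meeting the minimum would require exceeding the maximum area) is reported for review rather than decided automatically.

```python
# Limits quoted in rules 3 and 4; "other" covers every region except the West.
MAX_AREA_SQ_MI = {"West": 2000, "other": 1500}
MIN_POPULATION_1960 = {"West": 7500, "other": 10000}


def check_psu(region, area_sq_mi, population_1960, n_counties):
    """Return a list of rule departures for a proposed PSU (an empty list means none found)."""
    key = "West" if region == "West" else "other"
    problems = []
    # Rule 3: a single county larger than the maximum area is an allowed exception.
    if area_sq_mi > MAX_AREA_SQ_MI[key] and n_counties > 1:
        problems.append("area exceeds maximum")
    # Rule 4: flagged for review, since the exception depends on neighboring counties.
    if population_1960 < MIN_POPULATION_1960[key]:
        problems.append("population below minimum")
    return problems
```

For example, `check_psu("West", 1800, 9000, 2)` returns an empty list, while the same figures for a PSU in the South return both messages.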

Stratification of PSU's

The sample designs for CPS and HIS have changed
several times since the surveys began, but in 1962
when Cycle II was designed both consisted of 357 strata
and 357 sample PSU's, one PSU from each stratum.
In determining which PSU's should be grouped together
to form a stratum, a number of factors were considered.

1. Since only one PSU was to be selected from
   a stratum with a probability proportional to
   a measure of size, each PSU with a population
   above a certain size was put into a
   separate stratum by itself. Those PSU's are
   referred to as "self-representing." The population
   size cutoff for self-representing PSU's
   when most of the stratification work was done
   in the 1950's was 400,000 according to the
   1950 population census. In some instances,
   however, a PSU with less than 400,000 people
   was classified as self-representing. These
   were smaller SMSA's within 100 miles of an
   SMSA with over 400,000 people. This was done
   since the field organization that served the
   larger city could also serve the smaller one
   and thus reduce survey costs.

In 1962 when the HIS sample was redesigned, utilizing
1960 census data, an additional criterion was introduced;
namely, that PSU's with a population size
greater than 75 percent of the national average for
non-self-representing strata should also be self-representing.
The result of this was that all PSU's with
more than 242,000 population in 1960 were classed as
self-representing. For the 357 strata, 112 PSU's were
self-representing and 245 sample PSU's were not.

2. Strata should be approximately the same size
   except where a single PSU was larger than
   an average stratum. The average population
   for non-self-representing strata within regions
   ranged from 298,000 to 349,000 (table I).

3. Strata containing more than one PSU should
   be as homogeneous as possible. Combining
   this with the principle for forming PSU's, a
   stratum should contain PSU's which tend to be
   alike, but the ultimate sampling units within
   PSU's should be as unalike as possible. The
   basic modes of stratification were:

   SMSA or not
   Rate of population change, 1950 to 1960
   Percent of population living in urban areas
   Percent of population in manufacturing
   Principal industries
   Average value of retail trade
   Proportion of population that was not white

4. The geographic spread of PSU's for non-self-representing
   strata is restricted only by the
   four census regional boundaries. That is, a
   stratum might be composed of PSU's located
   anywhere in a region but cannot contain
   non-self-representing PSU's located in different
   regions. Some effort was made, however, to
   combine PSU's located in the same Census
   division within regions.
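
The 1962 size criterion described under item 1 above can be written as a one-line test. In this Python sketch the 322,000 figure is the average 1960 population of the non-self-representing strata shown in table I; the constant and function names are assumptions. Seventy-five percent of 322,000 is 241,500, which the text rounds to 242,000. The sketch covers only this criterion, not the earlier 400,000 rule or the 100-mile rule for smaller SMSA's.

```python
# Average 1960 population of non-self-representing strata (table I).
AVERAGE_NSR_STRATUM_1960 = 322_000
CUTOFF_SHARE = 0.75

# Population above which a PSU is classed as self-representing under the 1962 criterion.
CUTOFF_POPULATION = CUTOFF_SHARE * AVERAGE_NSR_STRATUM_1960


def is_self_representing(psu_population_1960):
    """Apply only the 1962 size criterion to a PSU's 1960 population."""
    return psu_population_1960 > CUTOFF_POPULATION
```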



The first step in the stratification process was
to allocate each PSU to one of three groups. All
self-representing PSU's were assigned to group 1;
non-self-representing PSU's located in areas of
relatively high population density were put in group 2,
and the remaining PSU's were assigned to group 3.
The next step was to classify groups 2 and 3 into
three subgroups according to degree of urbanization.
One subgroup contained SMSA's not classified in
group 1. The other two subgroups were labeled
"urban" and "rural." A PSU was considered rural if
its rural farm population was 35 percent or more
of the total, or if the rural farm population of the
PSU was less than 35 percent but the population in
urban places was less than the rural farm population
and the rate of population increase was well below
the average for the general area in which the PSU was
located. After those two steps, stratification
proceeded with primary attention being given to rate of
population increase, degree of urbanization, color,
principal industry, and type of farming. After
semifinal stratification was completed the results were
reviewed, and a few subjective changes were made
which reviewers thought would increase socioeconomic
homogeneity between PSU's within strata. Thus 357
strata were formed which have characteristics as
shown in table I.
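
The rule for labeling a PSU "rural" can be sketched in Python as follows. The 35-percent threshold comes from the text; the text does not say how far below the area average the rate of population increase had to be, so the `growth_margin` parameter (and its default of 5 percentage points) is purely a hypothetical stand-in for "well below," as are the function and argument names.

```python
def is_rural(total_pop, rural_farm_pop, urban_place_pop,
             growth_rate, area_growth_rate, growth_margin=0.05):
    """Classify a PSU as rural under the two-part rule described in the text.

    growth_rate and area_growth_rate are proportions (0.10 means 10 percent).
    growth_margin is a hypothetical reading of "well below the average."
    """
    farm_share = rural_farm_pop / total_pop
    if farm_share >= 0.35:
        return True
    # Second branch: farm share under 35 percent, but fewer people in urban
    # places than on farms, and growth well below the surrounding area.
    return (urban_place_pop < rural_farm_pop
            and growth_rate < area_growth_rate - growth_margin)
```

A PSU of 20,000 people with 8,000 on farms is rural by the first test alone; one with 6,000 on farms, 4,000 in urban places, and growth of 2 percent against an area average of 12 percent is rural by the second.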
Table I. Number and average size of strata in the 357 area design by type of strata and region

                                                      SMSA                      Non-SMSA
Region and type           Number   Average 1960   Number   Total 1960       Number   Total 1960
of strata                 of       strata         of       population(1)    of       population
                          strata   population     strata   (in thousands)   strata   (in thousands)

Self-representing         112                     107      99,228           5        1,296
  Northeast               28       1,225,000      23       32,905           5        1,296
  North Central           26       1,013,000      26       26,346
  South                   36       571,000        36       20,563
  West(2)                 22       882,000        22       19,415

Non-self-representing     245      322,000        39       13,515           206      65,283
  Northeast               30       349,000        7        2,785            23       7,691
  North Central           76       333,000        13       4,614            63       20,659
  South                   110      313,000        17       5,399            93       29,012
  West(2)                 29       298,000        2        717              27       7,921

(1) Because of minor differences between HIS design and Census in what was treated as an SMSA,
the total of SMSA population on the table is about 141,000 less than SMSA total according to 1960
Census of Population.

(2) Includes Alaska and Hawaii.
APPENDIX III
HOUSEHOLD QUESTIONNAIRE

CONFIDENTIAL - The National Health Survey is authorized by Public Law 652 of the 84th Congress (70 Stat.
489; 42 U.S.C. 305). All information which would permit identification of the individual will be held strictly
confidential, will be used only by persons engaged in and for the purposes of the survey, and will not be
disclosed or released to others for any other purposes (22 FR 1687).

APPROVAL EXPIRES JULY 31, 1965

FORM NHS-HES-2

U.S. DEPARTMENT OF COMMERCE
BUREAU OF THE CENSUS
ACTING AS COLLECTING AGENT FOR THE
U.S. PUBLIC HEALTH SERVICE

NATIONAL HEALTH SURVEY

1. Questionnaire ___ of ___ Questionnaires

2. (a) Address or description of location (include city, zone, and State)

2. (b) Mailing address if not shown in 2(a) OR □ Same as shown in 2(a)

2. (c) Name of special dwelling place

3. Identification code    4. PSU number    5. Segment number    6. Serial number

If this questionnaire is for an "EXTRA" unit in a B or NTA Segment, enter:
   Serial No. of original Sample Unit
   Item No. by which found
If in NTA Segment, also enter for FIRST unit listed on property:
   Segment List Sheet No.
   Line No.

7. Type of living quarters (check one box)
   □ Housing unit    □ Other unit

Ask items 8 and 9 only if "Rural" box is marked
   1 □ Rural    2 □ Other (Skip to 10)

8. Do you own or rent this place?
   1 □ Own (Ask 9(a))    2 □ Rent (Ask 9(b))    3 □ Rent free (Ask 9(a))

9. (a) If Own or Rent free, ask - Does this place have 10 or more acres?
   (b) If Rent, ask - Does the place you rent have 10 or more acres?
       1 □ Yes    2 □ No
   (c) During the past 12 months did sales of crops, livestock, and other farm products
       from the place amount to $50 or more?
       1 □ Yes    2 □ No
   (d) During the past 12 months did sales of crops, livestock, and other farm products
       from the place amount to $250 or more?
       1 □ Yes    2 □ No

ALL segments (ask if Item 2(a) address identifies a SINGLE-UNIT structure).
10. Are there any occupied or vacant living quarters BESIDES YOUR OWN--
    --in the basement?                      □ Yes--S___    □ No
    --on this floor?                        □ Yes--S___    □ No
    --on any other floor of this building?  □ Yes--S___    □ No
    (Fill Table X for each quarters NOT listed.)

ALL segments (ask if Item 2(a) identifies entire floor or unnumbered part of
floor in a MULTI-UNIT structure).
11. Are there any occupied or vacant living quarters BESIDES YOUR OWN--
    If Item 2(a) identifies entire floor,
    --on this floor?
    If Item 2(a) identifies part of the floor, specify part
    --in the -- of this floor?
    □ Yes--S___    □ No
    (Fill Table X for each quarters NOT listed.)

TA and NTA segments (ask at all units EXCEPT APARTMENT HOUSES).
12. Is there any other building on this property for people to live in - either occupied
    or vacant?
    □ Yes--S___    □ No
    (Fill Table X for each quarters NOT listed.)

13. What is the telephone number here?
    Telephone No. ___ OR □ No telephone

(INTERVIEWER): If eligible child in household enter child's name,
segment, serial, and column number on Medical
History Form.

(READ TO RESPONDENT)
In addition to the information you have already given me, I would like
to leave this form to be filled out about --. The form is self-explanatory.
A representative of the U.S. Public Health Service will come by
to pick up the form in a week or so. (Ask Item 14)

14. What would be the best time of day for the
    Medical histories left for--    Column No(s).
    Person with whom form left--    Column No. and relationship

15. RECORD OF CALLS AT HOUSEHOLD
    Item: Entire household
    Date    Time    Com.

16. REASON FOR NON-INTERVIEW
    TYPE    Reason:
    □ Refusal (Describe in footnotes)
    □ No one at home--repeated calls
    □ Temporarily absent
    □ Other (Specify)

    □ Vacant--non-seasonal
    □ Vacant--seasonal
    □ Usual residence elsewhere
    □ Other (Specify)

    □ Demolished
    □ In sample by mistake
    □ Eliminated in sub-sample
    □ Other (Specify)

17. TYPE A FOLLOW-UP PROCEDURE
    If final call results in a Type A non-interview (except Refusals) take the following steps:
    1. Contact neighbors (caretakers, etc.) until you find someone who knows the family.
    2. [illegible] people in the household, their names and approximate ages;
    3. [illegible]

    Interview not obtained for Cols. ___ because: ___

18. Signature of interviewer

19. Code



ALL

1. (a) What is the name of the head of this household? (Enter name in first column.)
   (b) What are the names of all other persons who live here? (List all persons who live here.)
   (c) I have listed (Read names). Is there anyone else staying here now such as friends, relatives, ...
   (d) Have I missed anyone who usually lives here but is now--
       --Temporarily in a hospital?   □ Yes (List)   □ No
       --Away on business?            □ Yes (List)   □ No
       --On a visit or vacation?      □ Yes (List)   □ No
   (e) Do any of the people in this household have a home anywhere else?
       □ Yes (Apply household membership rules. If not a household member, delete)   □ No (Leave on questionnaire)
   Last name    First name

2. How are (is) -- related to the head of the household?
   (Enter relationship to head, for example: wife, daughter, stepson, grandson, mother-in-law, partner, roomer's wife, etc.)
   Relationship: HEAD

3. Race (Mark one box for each person)
   □ White   □ Negro   □ Other

4. Sex (Mark one box for each person)
   □ Male   □ Female

5. (a) How old were you on your last birthday?
       Age ___   □ Under 1 year
   For each child age 5-12 listed on the questionnaire, ask:
   (b) What is the month, day, and year of --'s birth?
       (Check with Question 5(a) for consistency)
       Month    Day    Year

TO INTERVIEWER: Mark "EC" box for each eligible child (age 6-11) listed on the questionnaire. If no EC,
ask coverage questions on Page 1.
NOTE: Questions 6-14 must be asked only of parent(s) or guardian(s) of EC. If no parent or
guardian is at home, arrange to call back when they will be home.
   □ EC   □ Not EC

ASK FOR EC

Ask only for EC (children 6-11 years of age)
6. What is the name and location of the school -- goes to?
   □ No school    Name and location
   (a) What grade is -- in?
       Grade

ASK FOR PARENTS OR GUARDIANS OF EC

7. Where were you born?
   (Check U.S. box or write in name of country)
   □ U.S.    Foreign country

8. Are you primarily right handed, primarily left handed, or both?
   □ Right   □ Left   □ Both

9. What is the highest grade you attended in school?
   (Circle highest grade attended or mark "None")
   □ None
   Elem. .... 1 2 3 4 5 6 7 8
   High ..... 1 2 3 4
   College .. 1 2 3 4 5+
   (If attended, ask):
   (a) Did you finish this grade (year)?
       □ Yes   □ No

10. What were you doing most of the past 3 months - working, keeping house, or doing something else?
    □ Working   □ Keeping house   □ Something else
    (If "Doing something else," ask):
    (a) What were you doing? (Enter reply verbatim and ask 10(b))
    (If "Keeping house" OR "Doing something else," ask):
        □ Yes   □ No
    (If "Working" in 10 OR "Yes" in 10(b), ask):
        □ Full-time   □ Part-time

11. Are you now married, widowed, divorced, or separated?
    □ Married   □ Divorced   □ Widowed   □ Separated
    (If "Married," ask):
    (a) Have you (your husband) been married more than once?
        □ Yes   □ No

PARENTS ONLY

12. Besides (Read names of children entered in Question 1) have you and (or) your husband (wife) ever had
    any other children?
    □ Yes   □ No   □ No parent
    (If "Yes," ask):
    (a) What are their names?
    (b) How old is --? (If now deceased enter date of birth)
    (c) Where does he (she) live now? (If now deceased enter "deceased")
    Name    Age    Present whereabouts

ALL EC HOUSEHOLDS

Please look at this card (Hand respondent HES-2(a) card and pencil).
13. Do any of the questions on that card apply to any members of the family? Please mark "Yes" or "No"
    for each question.
    (For each "Yes" marked, ask):
    (a) You have checked --. Who was this?
    (b) When was this?
    NOTE: If "1" marked, enter name of institution.
    Statement No.    Name    Relationship    Year(s)    Name of institution

14. Which of these income groups represents your total combined family income for the past 12 months, that is,
    yours, your --'s, etc.? (Show Income Flash Card HES-2(b).) Include income from all sources, such as wages,
    salaries, rents from property, Social Security or retirement benefits, help from relatives, etc.
    Group ___

(Go to Question 15 on Page 4)
(The answer columns for questions 1-11 are repeated on the form for each additional
household member, with the same entries: last name and first name; relationship; race;
sex; age; month, day, and year of birth; EC or not EC; school name, location, and grade;
U.S. or foreign country of birth; handedness; highest grade attended and whether finished;
activity in the past 3 months; and marital status. Answer spaces for questions 12-14
provide for name, age, and present whereabouts; name, relationship, year(s), and name
of institution; and income group.)
15. Is any language other than English spoken here in your home?
    (If "Yes," ask):
    What language(s)?    Language(s) spoken ___

(Complete front page of questionnaire)

Comments: (Include here any information which might be useful to the PHS representative when she comes to pick up the Medical History Form.)

TABLE X. LIVING QUARTERS DETERMINATIONS AT LISTED ADDRESS

Column headings:

(1)  [illegible]
(2)  [illegible]
(3)  Are there (specify location) quarters for more than one group of people?
     Yes (Fill one line for each group) (3a)    No (3b)
(4)  Location of unit (Examples: Basement, 2nd floor, etc.)

USE OF CHARACTERISTICS
     Occupied:
(5)  Do the occupants of these (specify location) quarters live and eat with
     any other group of people?    Yes (5a)    No (5b)
     All quarters: Do these (specify location) quarters have:
(6)  Direct access from the outside or through a common hall?    Yes (6a)    No (6b)
(7)  A kitchen or cooking equipment for exclusive use?    Yes (7a)    No (7b)

CLASSIFICATION
(8)  Not a separate unit (Add occupants to this questionnaire)
(9)  Fill separate questionnaire and interview:    HU (9a)    Other unit (9b)

IF HU in B SEGMENT, ASK
(10) In what year were these (specify location) quarters created?
     (If 1959 or 1960, also specify "F" if first half or "L" if last half)
(11) (If before July 1960) What was the name of the household head of these
     quarters on April 1, 1960?

Lines are provided for entries 1 and 2.
VITAL AND HEALTH STATISTICS PUBLICATION SERIES

Formerly Public Health Service Publication No. 1000

Series 1. Programs and collection procedures.--Reports which describe the general programs of the National
Center for Health Statistics and its offices and divisions, data collection methods used, definitions,
and other material necessary for understanding the data.

Series 2. Data evaluation and methods research.--Studies of new statistical methodology including: experimental
tests of new survey methods, studies of vital statistics collection methods, new analytical
techniques, objective evaluations of reliability of collected data, contributions to statistical theory.

Series 3. Analytical studies.--Reports presenting analytical or interpretive studies based on vital and health
statistics, carrying the analysis further than the expository types of reports in the other series.

Series 4. Documents and committee reports.--Final reports of major committees concerned with vital and
health statistics, and documents such as recommended model vital registration laws and revised
birth and death certificates.

Series 10. Data from the Health Interview Survey.--Statistics on illness, accidental injuries, disability, use
of hospital, medical, dental, and other services, and other health-related topics, based on data
collected in a continuing national household interview survey.

Series 11. Data from the Health Examination Survey.--Data from direct examination, testing, and measurement
of national samples of the civilian, noninstitutional population provide the basis for two types
of reports: (1) estimates of the medically defined prevalence of specific diseases in the United
States and the distributions of the population with respect to physical, physiological, and psychological
characteristics; and (2) analysis of relationships among the various measurements without
reference to an explicit finite universe of persons.

Series 12. Data from the Institutional Population Surveys.--Statistics relating to the health characteristics of
persons in institutions, and their medical, nursing, and personal care received, based on national
samples of establishments providing these services and samples of the residents or patients.

Series 13. Data from the Hospital Discharge Survey.--Statistics relating to discharged patients in short-stay
hospitals, based on a sample of patient records in a national sample of hospitals.

Series 14. Data on health resources: manpower and facilities.--Statistics on the numbers, geographic distribution,
and characteristics of health resources including physicians, dentists, nurses, other health
occupations, hospitals, nursing homes, and outpatient facilities.

Series 20. Data on mortality.--Various statistics on mortality other than as included in regular annual or
monthly reports: special analyses by cause of death, age, and other demographic variables, also
geographic and time series analyses.

Series 21. Data on natality, marriage, and divorce.--Various statistics on natality, marriage, and divorce
other than as included in regular annual or monthly reports: special analyses by demographic
variables, also geographic and time series analyses, studies of fertility.

Series 22. Data from the National Natality and Mortality Surveys.--Statistics on characteristics of births
and deaths not available from the vital records, based on sample surveys stemming from these
records, including such topics as mortality by socioeconomic class, hospital experience in the
last year of life, medical care during pregnancy, health insurance coverage, etc.


